MONDAY MORNING, 2 DECEMBER 1996 



KAUAI ROOM, 10:15 A.M. TO 12:05 P.M. 






Session 1aAO



Acoustical Oceanography: Acoustical Ocean Monitoring: Determination of Current and Temperature Fields I

David R. Palmer, Cochair 

NOAA/AOML, 4301 Rickenbacker Causeway, Miami, Florida 33149
Iwao Nakano, Cochair 

Ocean Research Department, Japan Marine Science and Technology Center, 2-15 Natsushima-cho, Yokosuka, Kanagawa, 237 Japan

Chair’s Introduction — 10:15 

Invited Papers 



10:20 

1aAO1. Monitoring of the transport of the ocean current by an acoustic transceiver array. Tomoyoshi Takeuchi (Univ. of
Electro-Communications, 1-5-1, Chofu, Tokyo, 182 Japan) and Keisuke Taira (Univ. of Tokyo, 1-15-1, Minamidai, Nakano-Ku,
Tokyo, 164 Japan) 

Current velocity can be determined by measuring the difference in sound travel time between two points. The
multipath inverted echo sounder (MIES) [Takeuchi et al., J. Acoust. Soc. Jpn. 49, 543-550 (1993)] was developed in order to apply
this method to the measurement of the volume transport of the Kuroshio. When two multipath inverted echo sounders are deployed
10 km apart, the mean current velocity can be measured from the difference of reciprocal travel times of the sound wave along two
sides of a triangle in a vertical plane, which is formed by the acoustic paths with the mooring distance as its base. The
measurement of volume transport was attempted by applying the above method, with three multipath inverted echo sounders
deployed so as to form an equilateral triangle on the sea bottom. In this paper, the result of the measurement of the volume transport
of the Kuroshio over Izu Ridge from 20 March to 25 April 1995 is presented.



10:35 

1aAO2. Ocean current and vorticity measurements using long-range reciprocal acoustic transmissions. Peter F. Worcester
(Scripps Inst. of Oceanogr., Univ. of California at San Diego, La Jolla, CA 92093)

Measurements of the sum and difference of the travel times of acoustic pulses propagating in opposite directions provide powerful 
tomographic tools. Travel-time signals due to sound-speed perturbations cancel in the difference travel time, leaving only the much 
smaller signals due to currents. Reciprocal acoustic transmissions are particularly well suited to measuring large-scale barotropic flow 
and areal-average relative vorticity, both of which are of great oceanographic importance, but difficult to measure using other 
techniques. The use of reciprocal transmissions to measure ocean currents and vorticity has been tested in a series of experiments at 
steadily increasing ranges, from 25 to 1275 km. Acoustically derived currents and vorticity have been found to be consistent with 
independent measurements in all cases. The barotropic tides have provided the best test signals, because they are both well known and 
of large scale. Point measurements provided by current meters provide less stringent tests. The high-frequency travel-time fluctuations 
due to internal-wave-induced sound-speed perturbations have been found to largely cancel in the differential times, demonstrating that
the ray paths of oppositely traveling signals are nearly reciprocal out to 1-Mm ranges, as is implicitly assumed when differential travel 
times are used to deduce ocean currents. 
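
The sum-and-difference idea in this abstract can be illustrated with a minimal sketch. It assumes a single straight path of known length through a medium with uniform sound speed and uniform along-path current, which is far simpler than the ray geometry of a real tomography experiment. The function name and the numbers in the usage note are hypothetical and are not taken from the paper.

```python
def sound_speed_and_current(path_length_m, t_with_s, t_against_s):
    """Separate sound speed and along-path current from reciprocal travel times.

    Assumes t_with = L / (c + u) and t_against = L / (c - u) on one
    straight path of length L (an idealization, not the paper's method).
    """
    c_plus_u = path_length_m / t_with_s        # effective speed with the current
    c_minus_u = path_length_m / t_against_s    # effective speed against the current
    sound_speed = 0.5 * (c_plus_u + c_minus_u)   # current cancels in the sum
    current = 0.5 * (c_plus_u - c_minus_u)       # sound speed cancels in the difference
    return sound_speed, current
```

For example, with an assumed 1000-km path, c = 1500 m/s, and u = 0.1 m/s, the two travel times differ by only about 0.09 s out of roughly 667 s, which illustrates why the current signal is described as much smaller than the sound-speed signal.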



10:50 

1aAO3. Ocean current effects on a low-frequency acoustic field and the feasibility of their use to monitor ocean dynamics.
Oleg A. Godin (School of Earth and Ocean Sci., Univ. of Victoria, P.O. Box 1700, Victoria, BC V8W 2Y2, Canada) 

Ray-travel-time nonreciprocity has been used in most tomography experiments to determine current velocity in the ocean by 
acoustic means at distances large compared to ocean depth. Although very successful in deep water, this approach is not applicable in 
the coastal ocean where ray arrivals are not separable and/or identifiable due to multiple bottom interactions. In this paper, other 
parameters of the acoustic field, including normal mode travel time, ray and mode horizontal refraction angles, and full field phase, 
are treated as possible data for the current velocity field. Some qualitative differences between acoustic fields in moving and motionless 
fluid are indicated and their importance for acoustic monitoring of ocean currents is emphasized. Existing mathematical models of 
low-frequency underwater sound propagation in a moving ocean are discussed. It is demonstrated that the nonreciprocities of various
acoustic field variables possess quite different sensitivities to the flow velocity field and robustness with respect to unavoidable
uncertainties in our knowledge of system and environmental parameters. Full-field inversion, based on an appropriate acoustic field
variable, is concluded to be a promising technique for current velocity remote sensing in coastal ocean environments. [Work
supported by NSERC and RBRF.] 

J. Acoust. Soc. Am., Vol. 100, No. 4, Pt. 2, Oct. 1996 3rd Joint Meeting: Acoustical Societies of America and Japan 







11:05 



1aAO4. Measuring currents with acoustic propagation. David M. Farmer (Inst. of Ocean Sci., P.O. Box 6000, Sidney, BC,
Canada) 

Water currents refract and otherwise modify acoustic signals providing a basis for their remote detection. In recent years this 
concept has been explored with high-frequency propagation in the coastal environment, but the concepts have a broader application 
to ocean measurement. The component of flow resolved along the acoustic path may be detected through changes in effective sound 
speed. Phase coherent reciprocal transmission allows separation of current-induced effects from scalar contributions due to temperature 
or salinity, leading to measurements of great accuracy and precision. For turbulent flows this approach has been extended into the 
inertial subrange, permitting measurement of path averaged dissipation. In contrast, coherence of the refractive variability permits 
measurement of the component of flow orthogonal to the acoustic path through detection of scintillation drift with an array. Additionally,
current advection of the pulse may be detected as a change in horizontal arrival angle. Resolution of both current speed and
refractive variability as a function of path position can be achieved by combining multielement source and receiver arrays in a spatial 
aperture filter. These concepts are now being applied in the Bosphorus to the measurement of exchange with the Black Sea using a 
two-level reciprocal scintillation array. 



11:20 

1aAO5. An approach to the coastal ocean acoustic tomography. Arata Kaneko, Hong Zheng (Dept. of Environ. Sci., Faculty of
Eng., Hiroshima Univ., Higashi-Hiroshima, 739 Japan), and Hideaki Noguchi (Chugoku Natl. Industrial Res. Inst., Kure, 731-01 
Japan) 

A reciprocal sound transmission system has been designed for long-term current measurements in the coastal sea with heavy ship 
traffic and fishing activities. The system was composed of two acoustic stations spaced a distance of 5.7 km on both sides of a channel 
in the Seto Inland Sea, Japan. Each station was equipped with a transmitter, hydrophone, and GPS receiver. Reciprocal sound 
transmission experiments between the two stations were successfully completed for 5 h, using a carrier of 11 kHz, modulated with the
M sequence of 10th order. The time coordinate at both stations was synchronized with an accuracy of 0.1 μs by the 1-Hz and 1-kHz
time signals of GPS. Range-averaged current velocities, estimated from the travel time data obtained reciprocally, were in good 
agreement with the results of the ADCP measurement obtained along the sound transmission line. A 10.6-km sound transmission 
experiment using the carrier of 11.0 kHz and the M sequence of 10th order was also done successfully in an adjacent channel of the 
Seto Inland Sea. The present sound transmission system can easily be extended to a coastal tomography system composed of an array 
of acoustic stations. 
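
As an illustration of the "M sequence of 10th order" mentioned above, the following sketch generates a maximal-length binary sequence of period 2^10 - 1 = 1023 with a linear feedback shift register. The abstract does not state which generator polynomial was used; the taps (10, 3) below are an assumption chosen only because they give a primitive degree-10 recurrence, and the function name is hypothetical.

```python
def m_sequence(order=10, taps=(10, 3), seed=1):
    """Return one period of a binary maximal-length (M) sequence.

    Uses the recurrence a[n] = XOR of a[n - t] for t in taps.
    The default taps are an assumed primitive choice for order 10,
    not the polynomial used in the experiment.
    """
    # Initial register contents taken from the bits of `seed`.
    seq = [(seed >> i) & 1 for i in range(order)]
    if not any(seq):
        raise ValueError("seed must give a nonzero initial state")
    length = 2 ** order - 1
    while len(seq) < length:
        bit = 0
        for t in taps:
            bit ^= seq[-t]   # feedback from the bit t steps back
        seq.append(bit)
    return seq
```

Mapping the bits to +1 and -1 gives a code whose circular autocorrelation is 1023 at zero lag and -1 at every other lag, which is the property that makes such sequences useful for precise travel-time estimation by pulse compression.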



Contributed Papers 



11:35 

1aAO6. Observation of barotropic-tide relative vorticity in the
northwest Atlantic. Brian D. Dushaw, Bruce M. Howe (A. P. L., Univ. 
of Washington, 1013 NE 40th St., Seattle, WA 98105-6698), Peter F. 
Worcester, Bruce D. Cornuelle (Univ. of California, La Jolla, CA 
92093-0213), and Kurt Metzger (Univ. of Michigan, Ann Arbor, MI 
48109) 

Time series of reciprocal ray travel times were obtained at 350-, 410-, 
and 670-km ranges in the western North Atlantic during the 1991-1992 
Acoustic Mid-Ocean Dynamics Experiment (AMODE). Transmissions 
were recorded for approximately 300 days between six transceivers in a 
pentagonal array. Barotropic current along each of the 15 propagation 
paths is derived from the difference of reciprocal ray travel times, while 
ten independent estimates of areal-averaged relative vorticity are found by 
integrating current around triangles in the pentagonal array. The estimated 
tidal currents are highly accurate, and tidal relative vorticity at the $M_2$
frequency is detected. This vorticity is induced primarily by the stretching
of vortex lines by tidal elevation. Harmonic constants (amplitude, phase)
of $M_2$ tidal vorticity are about ($(4\text{-}8 \pm 2) \times 10^{-9}$ s$^{-1}$, $270^\circ$ to $320^\circ \pm 20^\circ$),
while harmonic constants of about ($(2\text{-}3) \times 10^{-9}$ s$^{-1}$,
$300^\circ$ to $340^\circ$) are predicted using the shallow-water equations. The measured
tidal harmonic constants are compared with those derived from a
global barotropic tidal model.
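
The step of "integrating current around triangles" can be sketched as follows. By Stokes' theorem, the areal-average relative vorticity equals the circulation around the triangle divided by its area. This is a minimal flat-plane illustration with hypothetical function names. It assumes each path-averaged current is given along a straight side, taken in counterclockwise order, and it ignores Earth curvature and ray geometry.

```python
import math

def triangle_area(vertices):
    """Signed area of a triangle (positive for counterclockwise vertices)."""
    (x1, y1), (x2, y2), (x3, y3) = vertices
    return 0.5 * ((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))

def areal_mean_vorticity(vertices, along_path_currents):
    """Areal-average vorticity from path-averaged currents on three sides.

    along_path_currents[i] is the mean current (m/s) directed from
    vertices[i] toward vertices[(i + 1) % 3]; vertices are in metres.
    """
    circulation = 0.0
    for i, u in enumerate(along_path_currents):
        xa, ya = vertices[i]
        xb, yb = vertices[(i + 1) % 3]
        circulation += u * math.hypot(xb - xa, yb - ya)  # u times side length
    return circulation / triangle_area(vertices)
```

With six transceivers there are 15 transceiver pairs, consistent with the 15 propagation paths quoted in the abstract.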



11:50 

1aAO7. Acoustic observations of Mediterranean flow into the Black
Sea. Daniela Di Iorio and Tuncay Akal (SACLANT Undersea Res. Ctr., 
Viale San Bartolomeo 400, 19138 La Spezia, Italy) 

The physical behavior of the Mediterranean flow entering the Black
Sea through the Bosphorus Strait is described using a variety of high-frequency
acoustic systems. Because of the density difference between
salty Mediterranean and fresh Black Sea water, a two-layer exchange is
formed which is confined within a canyon in the Black Sea exit region of
the Bosphorus Strait. A 307-kHz acoustical scintillation system placed 6 m
from the seafloor and covering a 300-m propagation path is used to describe
the mean Mediterranean current speed and the turbulent velocity
fluctuations within the bottom boundary layer of the Mediterranean flow
during a 4-day period when the exchange was maximal. In the idealized
case of isotropic and homogeneous turbulence, estimates of the turbulent
kinetic energy dissipation rate lead to values ranging from $1 \times 10^{-6}$ to
$5 \times 10^{-5}$ W kg$^{-1}$. A 600-kHz broadband acoustic Doppler current profiler
placed within the canyon shows that the two-layer exchange displays temporal
variability over scales of a few days associated with the meteorological
conditions in the Black Sea. To help interpret the oceanographic
measurements, a 120-kHz high-resolution echo sounder is used to obtain
two-dimensional images of the two-layer exchange.






MONDAY MORNING, 2 DECEMBER 1996 



WAIANAE ROOM, 10:15 A.M. TO 12:00 NOON 



Session 1aNS



Noise: Booms, Jets, Ducts and More 

Mary M. Prince, Cochair 

Centers for Disease Control and Prevention, National Institute for Occupational Safety and Health, 4676 Columbia Parkway,
Cincinnati, Ohio 45226

Ichiro Yamada, Cochair 

Kobayashi Institute of Physical Research, 3-20-41 Higashi-motomachi, Kokubunji, Tokyo, 185 Japan 

Contributed Papers 



10:15 

1aNS1. Validation of launch vehicle sonic boom predictions. J.
Micah Downing (USAF Armstrong Lab., Wright-Patterson AFB, OH
45433) and Kenneth Plotkin (Wyle Labs., Inc., Arlington, VA 22202)

Concern has been raised about the sonic boom levels generated by 
launch vehicle operations. To address this concern the USAF recently 
monitored the sonic boom generated by the launch of a Titan IV rocket and 
the reaction of the local wildlife. The sonic boom generated by this launch 
intersected the Channel Islands off the coast of Southern California. The 
sonic booms were recorded by the USAF Boom Event Analyzer Recorder 
and by DATs with low-frequency microphones on San Miguel, Santa Rosa, 
and Santa Cruz islands. Before the launch, predictions were made to assess 
the sonic boom impact on the islands. The prediction model used in this 
comparison was PCBoom3, a single-event sonic boom model developed 
by the USAF. This model has been updated to model sonic booms gener- 
ated by launch vehicles and includes the effect of the rocket plume on the 
generation of the sonic boom. The measured sonic boom signatures are 
compared to predictions. 



10:30 

1aNS2. Sonic-boom noise penetration into the ocean: 1996 update.
Victor W. Sparrow, Judith L. Rochat, and Tracie J. Ferguson (Grad. Prog. 
Acoust., Penn State, 157 Hammond Bldg., University Park, PA 16802) 

Recently, there has been substantial progress in predicting the penetra- 
tion of sonic boom noise into the ocean. Such noise penetration occurs for 
either commercial or military supersonic aircraft operating over the ocean. 
As previously discussed [J. Acoust. Soc. Am. 97, 3258(A) (1995)], one can 
use analytical techniques to make predictions of the noise penetration, but 
eventually finite difference calculations become the method of choice 
when accounting for wind wave swell and ocean inhomogeneities. Two- 
dimensional calculations already indicate that typical wind wave swell can 
focus or defocus a penetrating rounded sonic boom waveform up to ap- 
proximately ±1.5 dB for a homogeneous ocean below a homogeneous 
atmosphere. Higher Mach number supersonic flight accentuates the focus- 
ing. Predictions are currently being sought for more realistic sonic boom 
waveform shapes, three-dimensional interactions between the incident 
boom and the ocean swell, and the effects of bubble plumes immediately 
below the ocean surface. One simulation of the focusing from ocean swell 
will be visualized via a videotape. [Work supported by NASA Langley
Research Center, under Grant NAG 1-1638, and by Armstrong Laboratory,
Air Force Materiel Command, USAF, under Grant F41624-96-1-0003.]




10:45 

1aNS3. Reduction of noise radiated from supersonic jets. Yoshikuni
Umeda and Ryuji Ishii (Div. of Aeronaut. and Astronautics, Dept. of Eng.
Sci., Kyoto Univ., Yoshida Hon-Machi, Sakyo-ku, Kyoto, 606 Japan) 

In the present investigation, sound-pressure levels radiated from the 
twin- and multijet with a square configuration were measured at various 
pressure ratios R from 2.00 to 6.33. The nozzle diameter was 5 mm and the 
center-to-center spacing of the nozzles was fixed to 1.4 times the nozzle 
diameter. From this experiment, it is found that the overall sound-pressure
level (OASPL) radiated from the multijet with a square configuration becomes
lower than that from the twin jet for the pressure ratio range R > 4.0.
In the near sound field upstream of the nozzle exit, the difference of
the OASPL between the twin- and multijet with a square configuration
reaches a maximum (about 7 dB) at a pressure ratio of about R = 5.0. This
decrease in the OASPL may be caused by the shielding of the sound
waves radiated from the inner part of the multijet with a square configuration
by the surrounding four jets, and by stabilization of the jet oscillation due
to the appropriate coupling process among the four jets. [Work supported by
the Ministry of Education and Culture of Japan (Grant No. C2-08650201).] 

11:00 

1aNS4. Exact solutions for sound radiation from a circular duct
with flows. Y. C. Cho (NASA Ames Res. Ctr., MS 269-3, Moffett Field, 
CA 94035-1000) 

Sound radiation from a circular duct is a classic problem. Exact solu- 
tions were previously reported for the case of a negligibly thin duct wall, 
using the Wiener-Hopf technique. Despite the elegance of its closed-form 
solutions, the numerical presentations have been limited to mere demon- 
strations of its capability. A computer program is not publicly available for 
its numerical evaluation. Such a numerical evaluation has recently become 
increasingly in demand not only for its own merits, but also for cross 
examination of the numerical results of computational aeroacoustics. 
These techniques are just starting to attract widespread attention as a po- 
tential tool in attacking important but unsolved aeroacoustic problems. 
This paper presents a comprehensive mathematical procedure for the nu- 
merical evaluation primarily for use in studies of noise emission of aircraft 
engines. Various mean flows are included for simulations of an aircraft jet 
engine exhaust and inlet, and aircraft cruise condition. Unlike previously 
published reports, this paper will include radiation of nonpropagating 
modes as well as propagating modes. 

11:15 

1aNS5. Lord Rayleigh revisited: Can vortices in tubes be sources of
noise? James B. Lee (Concert Acoustics, P.O. Box 80571, Portland, OR
97205) 

Energy can flow through tubes filled with air governed by the laws of 
fluid dynamics, even in the violent form of shock waves. Energy also can 
flow through such systems as oscillations of small amplitude, which are 
governed by the equations of linear acoustics. What happens in the regime
between? Calculations performed upon complaint of excessive low-frequency
noise during alternate blasting for two long parallel railway
tunnels indicated that particle velocities were a substantial fraction of the 
speed of sound at the tunnels’ mouths — between the regimes of shock 
waves and linear resonances. A Strouhal calculation suggested that vortices 
were the likely source: A sequence of counter-rotating vortices was being 
propelled out of the tunnels’ mouths, generating a powerful monopole 
source at 12 Hz. Modern noise consultants were baffled, but Lord Rayleigh 
had solved the system a century before, as a compound nonlinear eigen- 
value problem, necessarily taking viscosity into account. The blasting en- 
gineers drove small cross-bores between the tunnels, sucking circulating 
fluid out of the vortices, reducing power, and quieting neighbors. The 
experiment should be repeated at a smaller scale with better instruments 
under controlled conditions. 

11:30 

1aNS6. Sound energy distribution at an intersection of an
underground tunnel: Additional small-scale experiments. Hiroyuki
Imaizumi, Sunao Kunimatsu, and Takehiro Isei (Safety Eng. Dept., Natl.
Inst. for Resources and Environment, 16-3 Onogawa, Tsukuba, Ibaraki,
305 Japan) 

In studies on some interactions between sound propagation character- 
istics and environmental factors underground, small-scale experiments on 
the sound energy distribution at an intersection of tunnels have been carried
out in an anechoic room. The small-scale tunnels were made of
acrylic plates, and the original surfaces of inner walls were assumed to
have an acoustically hard condition, while surfaces of the walls covered by
flannel were acoustically soft. An electrically generated impulsive sound
was used as the sound source, and the propagated sounds were measured by
an omnidirectional microphone. The experiments were carried out under
several kinds of conditions for angles at the intersection and positions of 
the sound source. Influences of intersection on the sound energy distribu- 
tion were indicated by comparisons between the measuring points before 
and behind the intersection with different angles. In addition, the influ- 
ences of the acoustical characteristics of the inner walls and positions of 
the source are also presented. 



11:45 

1aNS7. Effects of perforated pipe on the higher-order modes in an
elliptical chamber. Tatsuyu Ikeda, Tsuyoshi Nishimura (Kumamoto
Inst. of Technol., 4-22-1 Ikeda, Kumamoto, 860 Japan), Tsuyoshi
Usagawa, and Masanao Ebata (Kumamoto Univ., Kumamoto, 860 Japan) 

The role of the perforated pipe in a simple elliptical expansion chamber,
especially its ability to improve noise reduction by decreasing the level
of resonance caused by transverse waves, is presented. The characteristics of
the perforated pipe have been studied by means of impedance which de- 
pends on the shape and porosity of a pipe. The following phenomena 
regarding sound-pressure distribution when the perforated pipe is located 
at the center of a simple elliptical expansion chamber have been observed. 
(1) The resonance frequencies of higher-order modes shifted to the lower 
frequency ranges in comparison with those when a pipe was not attached, 
and (2) the resonance frequencies inside a perforated pipe are similar to the 
ones outside; however, the resonance level inside is higher than the one 
outside. In addition, an arrangement of the output pipe is also studied 
based on the distribution of higher-order modes in a chamber. The experi- 
mental results which tend to prove the effectiveness of the perforated pipe 
are concretized by a mathematical formula in the paper. 



MONDAY MORNING, 2 DECEMBER 1996 WAIALUA ROOM, 10:15 TO 11:45 A.M. 



Session 1aPA



Physical Acoustics: Nonlinear Acoustics I: Propagation in Solids 

James A. TenCate, Cochair 

Los Alamos National Laboratory, EES-4, MS D443, Los Alamos, New Mexico 87545 

Akira Nakamura, Cochair 

Department of Electrical Engineering, Fukui Institute of Technology, 3-6-1 Gakuen, Fukui, 910 Japan 

Contributed Papers 



10:15 

1aPA1. Nonlinear surface wave propagation in a piezoelectric
material. M. F. Hamilton, Yu. A. Il’inskii, and E. A. Zabolotskaya
(Dept. of Mech. Eng., Univ. of Texas, Austin, TX 78712-1063)

Model equations were derived from first principles for nonlinear sur- 
face wave propagation in a piezoelectric material. There is no dependence 
on ad hoc or empirical parameters. The present work extends an earlier 
analysis of surface waves in crystals, described by the authors at the pre- 
vious meeting [J. Acoust. Soc. Am. 99, 2538(A) (1996)]. As in the earlier 
work, the model equations account for crystals having arbitrary symmetry, 
and for surface wave propagation in arbitrary directions in planes having 
arbitrary orientations with respect to the crystallographic axes. Here, elastic,
piezoelectric, dielectric, and electrostrictive properties of the material
are taken into account. The model is formulated as spectral evolution
equations that are integrated numerically to illustrate the distortion of a
finite amplitude surface wave that is sinusoidal at the source. The material
we considered is LiNbO$_3$, for which measured values of all required
second- and third-order elastic constants are available in the literature [e.g.,
the latter are reported by Cho and Yamanouchi, J. Appl. Phys. 61, 875
(1987)]. Analysis of the nonlinearity matrix permits identification of which
physical effects contribute most to the distortion of nonlinear surface 
waves. [Work supported by ONR, NSF, and Schlumberger Foundation.] 



10:30 

1aPA2. Second harmonic generation in a sound beam transmitted
through an isotropic solid. B. J. Landsberger, M. F. Hamilton, Yu. A.
Il’inskii, and E. A. Zabolotskaya (Dept. of Mech. Eng., Univ. of Texas,
Austin, TX 78712-1063) 

An ultrasonic immersion procedure for determining third-order elastic 
coefficients of rock via measurements of harmonic generation was described
recently by Plona et al. [J. Acoust. Soc. Am. 98, 2886(A) (1995)].
Here an accurate quasilinear model is presented for second harmonic gen- 
eration in a sound beam transmitted through an isotropic solid immersed in 
liquid. With the primary beam represented as an angular spectrum, analytic 
solutions were derived for second harmonic generation by all pairs of plane 
waves in both the liquid and the solid. Numerical superposition of the 
analytic solutions, followed by Fourier transformation, yields the second
harmonic field anywhere in the solid or liquid. There are no restrictions on
geometry or orientation of the sound source, and all combinations of com- 
pression and shear wave interactions are taken into account. At the previ- 
ous meeting the accuracy of this model for the case of reflection near the 
Rayleigh angle was demonstrated [J. Acoust. Soc. Am. 99, 2538(A) 
(1996)]. Reported here are comparisons with second harmonic diffraction 
patterns that were measured in a 1-MHz beam radiated by a 1.2-cm radius 
source and transmitted through a 10-cm-thick block of lucite immersed in
water. [Work supported by ONR, NSF, and Schlumberger Foundation.] 

10:45 

1aPA3. Comparative measurement of second-harmonic generation in
various materials. Xiaoli Hou and Corinne Darvennes (Dept. of Mech.
Eng., Tennessee Tech. Univ., Box 5014, Cookeville, TN 38505) 

Second-harmonic generation was measured in several man-made ma- 
terials for possible application of nonlinear properties to nondestructive 
testing. Samples included several thicknesses of two types of polymer 
matrix composites, three types of concretes, and plywood. Steel and alu- 
minum specimens were used as references and one of the composite 
samples was evaluated before and after fatigue cycles. A monochromatic 
ultrasonic signal was sent into each sample via a contact transducer placed 
on its top surface. The growth of the second harmonic was recorded, with 
a second transducer placed on the bottom face, as the amplitude of the 
input signal was gradually increased and for several values of the input 
frequency. Nonlinearity parameters could not be measured, due to the 
limitations of our equipment. Nonetheless, some interesting observations 
were made: (1) the two composites were much more nonlinear than the 
metals; (2) the concretes and the wood were extremely absorptive and an 
output signal was observed only at the lowest input frequency; and (3) 
fatigue cycles significantly increased the second harmonic, even though no 
damage was observed by C-scanning. [Work supported by NSF, TTU 
Manufacturing Center, and the FRG Program.] 

11:00 

1aPA4. Observations of nonlinearity with slow dynamics in rocks.
James A. TenCate, Thomas J. Shankland (EES-4 MS D443, Los Alamos
Natl. Lab., Los Alamos, NM 87545), Paul A. Johnson (Los Alamos Natl.
Lab. and Univ. Pierre et Marie Curie, 75252 Paris Cedex 05, France), and
Bernard Zinszner (Inst. Français du Pétrole, Rueil Malmaison Cedex,
France)

A typical resonance curve — measured acceleration versus drive 
frequency — made on a thin bar of rock shows peak bending with a soft- 
ening (nonlinear) modulus as drive levels are increased. Previous work 
showed the shapes of these nonlinear resonance curves depend on sweep 
rate, i.e., the “slow dynamics.” Slow dynamics in a 0.3-m-long, 50-mm-diam
bar of Berea sandstone under ambient conditions have been documented
for the first time. Peak strain levels during the experiments ranged
from $10^{-11}$ to $10^{-5}$ at a fundamental bar resonance frequency near 4 kHz.
Slow dynamics begin to appear at strain amplitudes above $10^{-6}$ at ambient
conditions and at the onset of nonlinear peak bending. Higher strains
condition the rock, altering its response for minutes to hours after the drive 
has been turned off. Other rocks show similar results. Physical origins of
the slow dynamics lie in nonlinear effects at the microstructural level of
cracks, pores, and interstitial clays. Further work examines environmental 
effects on conditioning and recovery as a means of relating them to physi- 
cal properties and microtexture of the rock. [Work supported by OBES/ 
DOE through the University of California.] 

11:15 

1aPA5. Modeling nonlinearity and slow dynamics for rock in
resonance experiments. Koen E. A. Van Den Abeele (EES-4 MS D443, 
Los Alamos Natl. Lab., Los Alamos, NM 87545) and Robert A. Guyer 
(Univ. of Massachusetts, Amherst, MA 01003) 

The presence of compliant features in rock causes nonlinear distortion 
of sound waves propagating through a sample, even at microstrain levels. 
When performing resonance experiments, one observes nonlinear behavior 
in different ways: resonant frequency shifts as a function of drive amplitude,
nonlinear dissipation including hysteresis, asymmetry of the resonance 
curves, and rich harmonic spectra. Recently, a new interesting feature has 
been added to these nonlinearity observations: the existence of a slow 
dynamics conditioning and recovering characteristic. Experimental evi- 
dence will be shown in a companion ASA contribution by TenCate et al. 
The focus is on the mathematical modeling of all of the nonlinear reso- 
nance observations, including the slow dynamics characteristics. The 
model uses a finite-difference time-domain solution of the one- 
dimensional wave equation with appropriate boundary conditions, and cal- 
culates waveforms along the sample. The key part of the model is the 
description of a modulus which depends on higher-order elastic constants, 
hysteresis strength, and its own weighted response over previous times. By 
integrating the history of the modulus using an exponential weighting 
function, interesting conditioning and recovering effects can be simulated 
which agree well with the nonlinear and slow time constant observations in 
rock. [Work supported by DOE/OBES/UCal.] 
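
The idea of weighting the history of a quantity with an exponential function can be illustrated with a generic first-order memory filter. This is not the authors' model: the abstract gives no equations, so the recursion, the function name, and the parameters below are assumptions meant only to show how an exponential weighting produces a slow build-up ("conditioning") while a drive is on and a slow decay ("recovery") after it is turned off.

```python
import math

def exponential_memory(signal, dt, time_constant):
    """Exponentially weighted running average of a sampled signal.

    Discrete form of m(t) = (1/T) * integral of exp(-(t - s)/T) x(s) ds,
    for a piecewise-constant input sampled every dt seconds (hypothetical
    illustration, not the published model).
    """
    alpha = math.exp(-dt / time_constant)   # weight retained from the past
    memory = 0.0
    history = []
    for x in signal:
        memory = alpha * memory + (1.0 - alpha) * x
        history.append(memory)
    return history
```

Driving this filter with a constant input for many time constants and then switching the input off gives a memory value that saturates near the drive level and afterwards falls by a factor of e for every elapsed time constant.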

11:30 

1aPA6. Theory and feasibility of breather solitons in sandstone.
Miguel Bernard and Bruce Denardo (Natl. Ctr. for Physical Acoust. and
Dept. of Phys. and Astron., Univ. of Mississippi, University, MS 38677)

A theoretical model and an experimental feasibility study of nonlinear 
Schrödinger breather solitons in a sandstone waveguide are presented.
These solitons are acoustic waves that are theoretically characterized by a 
standing wave motion in the transverse direction, an exponential self- 
localization along the waveguide, and a speed of propagation that can have 
any value that is small compared to the speed of sound. The sole require- 
ment in the model is that transverse standing waves soften (the resonance 
frequency decreases as the amplitude is increased). The softening of a 
Berea sandstone sample and the resonance frequency quality factors have 
been measured in the range 500 Hz to 15 kHz. With the use of a composite
transducer designed to optimize the drive amplitudes, frequency shifts of 
1% have been measured. Comparison of this shift with that due to non- 
uniformities and estimation of the soliton length and decay distance reveal 
that the observation of the solitons is feasible in Berea sandstone. This is 
a first step toward the observation of nonlinear Schrödinger acoustic
breather solitons in a sandstone core. 







MONDAY MORNING, 2 DECEMBER 1996 



LANAI ROOM, 10:15 A.M. TO 12:15 P.M.



Session 1aSC



Speech Communication: Lexical Factors in the Use of Spoken Language (Poster Session) 



Lynne E. Bernstein, Cochair 

Spoken Language Processing Laboratory, Department of Human Communication Sciences and Devices, House Ear Institute, 

2100 West Third Street, Los Angeles, California 90057 



Yasuo Ariki, Cochair 

Department of Electronics and Informatics, Ryukoku University, Seta, Otsu, Shiga, 520-21 Japan 



Contributed Papers 



All posters will be on display from 10:15 a.m. to 12:15 p.m. To allow contributors an opportunity to see other posters, contributors of 
odd-numbered papers will be at their posters from 10:15 a.m. to 11:15 a.m. and contributors of even-numbered papers will be at their 
posters from 11:15 a.m. to 12:15 p.m. 



1aSC1. The role of within-category structure in the integration of
auditory and visual speech. Michael D. Hall, Paula M. T. Smeele,
and Patricia K. Kuhl (Dept. of Speech and Hearing Sci., CHDD, Box
357920, Univ. of Washington, Seattle, WA 98195-7920)

The influence of visual information on auditory speech perception can 
be observed under conditions where the two sources of information are 
discrepant. One demonstration involves viewing a face producing /b/ while
listening to a dubbed /g/ token, with participants reporting that they heard 
/bg/ (McGurk and MacDonald, 1976). This “combination” response re- 
flects the contribution of both modalities. Two experiments evaluated 
whether differences in the perceived quality of auditory stimuli within the 
/g/ category influence the incidence of combination responses. Synthetic 
VCV stimuli ranging from “good” /aga/ to “poor” /aga/ tokens were 
generated by factorially combining 6 levels of F2 and 4 levels of F3
onset frequency. In experiment 1 participants identified these auditory 
stimuli and rated them with respect to goodness as a /g/. Goodness was 
found to be correlated with, but not completely predicted by, consonant 
identification. In experiment 2 these stimuli were separately dubbed with a 
visual /aga/ (“matched”) and /aba/ (“mismatched,” which should evoke 
combination responses). Results will be discussed in terms of the suffi- 
ciency of consonant identification and category goodness in predicting the 
probability of combination responses. These data will be used to address 
models of auditory-visual speech integration. [Work supported by 
NICHD.] 



1aSC2. Audiovisual integration of speech based on minimal visual
information. D. H. Whalen (Haskins Labs., 270 Crown St., New 
Haven, CT 06511), Julia Irwin (Haskins Labs., New Haven, CT 06511 
and Univ. of Connecticut), and Carol A. Fowler (Haskins Labs., New 
Haven, CT 06511, Yale Univ., and Univ. of Connecticut) 

Two competing theories have been proposed to explain the fact that 
vision can dominate over audition in syllables that have been spliced so 
that the two modalities specify different phonemes [McGurk and Mac-
Donald, Nature 263, 746-748 (1976)]. The first theory states that acoustic
and visual information are combined in varying proportions depending on 
how strong the information is in each signal. The second proposes that the 
visual signal has linguistic value because speech gestures can be conveyed 
visually, and that these gestures are the primitives of speech perception for 
every modality. The present experiment contrasts dynamic and static visual 
information by reducing the visual signal to two or three video frames, 
synchronized with the speech in the appropriate location. Dynamic stimuli 
had at least two frames showing movement of the mouth, while static ones
had a single frame, taken from the consonant closure, repeated to make a 
three frame visual image. Even these brief images were enough to elicit 
speech percepts that matched the visual image. The dynamic and static 
images were equally effective, suggesting revisions in both theories. [Work 
supported by NIH Grant No. HD-01994.] 



1aSC3. Relationships between word knowledge and visual speech
perception. I. Subjective estimates of word age of acquisition. 

Edward T. Auer, Jr., Robin S. Waldstein, Paula E. Tucker, and Lynne E. 
Bernstein (Spoken Lang. Processes Lab., Human Commun. Sci. and 
Devices Dept., House Ear Inst., 2100 W. Third St., Los Angeles, CA 
90057) 

In individuals with normal hearing, words estimated to be learned 
earlier are recognized more rapidly than words estimated to be learned 
later. To investigate how word knowledge is related to lipreading profi- 
ciency, word age-of-acquisition (AOA) estimates were obtained from 50 
hearing (H) and 50 deaf (D) (80-dB HL pure-tone average or greater 
hearing losses acquired before the age of 48 months) adults. Participants 
judged AOA for the 175 words in Form M of the Peabody Picture Vocabu-
lary Test-Revised using an 11-point scale, and responded whether the
words were acquired through speech, sign language, or orthography. The 
two groups differed in when (mean AOA: H = 8.9 years, D = 10.6 years) 
and how (H=69% speech and 31% orthography; D=38% speech, 45% 
orthography, and 17% sign language) words were judged to be acquired. 
However, item analyses revealed that the relative acquisition order was 
essentially identical across groups (r = 0.965). Interestingly, within the
deaf group, better lipreaders estimated that more words had been learned 
through speech than orthography. An implication of these results is that 
learning words primarily through orthography does not support highly 
accurate spoken language processing. [Work supported by NIH Grant No. 
DC00695.] 



1aSC4. Relationships between word knowledge and visual speech
perception. II. Subjective ratings of word familiarity. Robin S. 
Waldstein, Edward T. Auer, Jr., Paula E. Tucker, and Lynne E. Bernstein 
(Human Commun. Sci. and Devices Dept., House Ear Inst., 2100 W. Third 
St., Los Angeles, CA 90057) 

Word familiarity is an important factor in word recognition and lexical 
access for hearing individuals. Subjective word familiarity ratings are hy-
pothesized to reflect experience with words irrespective of the modality
(i.e., spoken or written) through which exposure has taken place, and to 
provide an estimate of the size of the mental lexicon. To investigate how 
word familiarity is related to lipreading proficiency, 450 printed words 
were presented for rating on a seven-point scale to 50 deaf and 50 hearing 
participants. Preliminary results revealed that the deaf participants pro- 
duced lower mean familiarity ratings than did the hearing participants, for 
high-, medium-, and low-familiarity words (hearing means = 6.7, 4.8, 3.0; 
deaf means = 6.0, 3.8, 2.6). Among the deaf participants, correlations be- 
tween established familiarity ratings and individuals’ ratings were reliably 
higher for excellent than for good lipreaders, a possible indication that 
perceptual experience influences the structure of the lexicon. At the same 
time, the performance of the excellent lipreaders provides support for the 
hypothesis that lexical organization does not depend on the perceptual 
input modality (i.e., vision versus hearing). [Work supported by NIH Grant 
No. DC00695.] 



1aSC5. Speechreading talking faces. Dominic W. Massaro,
Christopher S. Campbell, Michael A. Berger, and Michael M. Cohen
(Dept. of Psych., Univ. of California, Santa Cruz, CA 95064)

It is now well established that visible speech is an important source of 
information in face-to-face communication. Given the valuable role of 
audible speech synthesis in experimental, theoretical, and applied arenas, 
visible speech synthesis has been developed. Research has shown that the 
talking head closely resembles real heads in the quality of its speech and its 
realism (when texture mapping is used). The talking head can be heard, 
communicates paralinguistic as well as linguistic information, and is con- 
trolled by a text-to-speech system. Several sources of evidence are pre- 
sented which show that visible speech perception (speechreading) is fairly 
robust across various forms of degradation. Speechreading remains fairly 
accurate even when the mouth is viewed in noncentral vision; eliminating 
and distorting high-spatial frequency information does not completely dis- 
rupt speechreading; and speechreading is possible when additional visual 
information is simultaneously being used to recognize the speech input. 
The results are consistent with the fuzzy logical model of perception in 
which multiple sources of information are used to recognize patterns. Vari- 
ous visible feature sets are tested within the framework of the model to 
determine which visible features are functional in speechreading. Demon- 
strations of the talking head and various psychological phenomena will be 
provided. [Work supported by NIDCD.] 



1aSC6. Sensitivity of cued speech reception to cue imperfections.

Maroula S. Bratakos and Louis D. Braida (Res. Lab. of Electron., MIT, 
Cambridge, MA 02139) 

In manual cued speech (MCS) a speaker gestures with his/her hand to 
resolve ambiguities among speech elements that are often confused by 
speechreaders. The shape of the hand distinguishes among consonants, and 
the position of the hand relative to the face distinguishes among vowels. 
Experienced receivers of MCS achieve nearly perfect reception of every- 
day connected speech. To understand the benefits that might be derived 
from the imperfect cues produced by an automatic cueing system, video- 
taped sentences with handshapes corresponding to the phones identified by 
simulated phonetic speech recognizers were dubbed. The cues dubbed on 
these sentences were discrete in both shape and position rather than fluidly 
articulated, and the speaking rate was roughly 50% faster than for MCS. 
When the phones identified by an ideal recognizer were used to produce 
the cues, performance was only slightly lower than for MCS. When cues 
were derived from an existing recognizer, intelligibility was reduced, but 
substantial benefits to speechreading were observed. Current research is 
aimed at developing an automatic speech recognition system with the 
speed, accuracy, and computational efficiency required for a real-time au- 
tomatic cueing system. [Work supported by NIH.] 



1aSC7. Hemispheric differences in perceiving and integrating
dynamic visual speech information. Jennifer A. Johnson and
Lawrence D. Rosenblum (Dept. of Psych., Univ. of California, Riverside,
CA 92521) 

There is evidence for a left-visual-field/right-hemisphere (LVF/RH) 
advantage for speechreading static faces [R. Campbell, Brain & Cognit. 5, 
1-21 (1986)] and a right-visual-field/left-hemisphere (RVF/LH) advantage
for speechreading dynamic faces [P. M. Smeele, NATO ASI Workshop 
(1995)]. However, there is also evidence for a LVF/RH advantage when 
integrating dynamic visual speech with auditory speech [e.g., E. Diesch, Q. 
J. Exp. Psychol.: Human Exp. Psychol. 48, 320-333 (1995)]. To test rela- 
tive hemispheric differences and the role of dynamic information, static, 
dynamic, and point-light visual speech stimuli were implemented for both 
speechreading and audio-visual integration tasks. Point-light stimuli are 
thought to retain only dynamic visual speech information [L. D. Rosen- 
blum and H. M. Saldana, J. Exp. Psychol.: Human Percept. Perform. 22, 
318-331 (1996)]. For both the speechreading and audio-visual integration 
tasks, a LVF/RH advantage was observed for the static stimuli, and a 
RVF/LH advantage was found for the dynamic and point-light stimuli. In 
addition, the relative RVF/LH advantage was greater with the point-light 
stimuli implicating greater relative LH involvement for dynamic speech 
information. 



1aSC8. Face identification using visual speech information. Deborah
A. Yakel and Lawrence D. Rosenblum (Dept. of Psych., Univ. of
California, Riverside, CA 92521)

Traditionally, the recovery of linguistic message and speaker identity is 
thought to involve distinct operations and information. However, recent 
observations with auditory speech show a contingency of speech percep- 
tion on speaker identification/familiarity [e.g., Nygaard et al., Psychol. 
Sci. 5, 42-46 (1994)]. Remez and his colleagues [Remez et al., J. Exp. 
Psychol. (in press)] have provided evidence that these contingencies could
be based on the use of common phonetic information for both operations. 
In order to examine whether common information might also be useful for 
face and visual speech recovery, point-light visual speech stimuli were
implemented which provide phonetic information without containing fa-
cial features [L. D. Rosenblum and H. M. Saldana, J. Exp. Psychol.: Hu-
man Percept. Perform. 22, 318-331 (1996)]. A 2AFC procedure was used
to determine if observers could match speaking point-light faces to the 
same fully illuminated speaking face. Results revealed that dynamic point- 
light displays afforded high face matching accuracy which was signifi- 
cantly greater than accuracy with frozen point-light displays. These results 
suggest that dynamic speech information can be used for both visual 
speech and face recognition. 



1aSC9. The effect of utterance duration on visual-speech
intelligibility scores. Jean-Pierre Gagné and Lina Boutin (École
d’orthophonie et d’audiologie, Univ. de Montréal, Montréal, PQ H3C 3J7,
Canada)

The effect of duration on the visual-speech intelligibility of talkers was 
investigated. The stimulus set consisted of 25 sentences. Each sentence had 
the same grammatical structure and contained three critical elements (sub- 
ject, verb, object). Seven talkers were videotaped while they spoke a list of 
sentences twice using conversational and clear speech. For each talker, the 
mean durations of the conversational and clear sentences were measured.
The recordings were digitized and dubbed under four conditions: (1) nor- 
mal conversational; (2) conversational speech decelerated to a speed 
equivalent to the duration of the talker’s average utterances of clear 
speech; (3) natural clear speech; (4) clear speech accelerated to a speed 
equivalent to the talker’s average conversational speech. The test sentences 
(25 sentences × 2 iterations × 4 durations × 7 talkers) were randomized
and shown (without sound) to a group of 18 subjects. The responses ob- 
tained during the perceptual task were used to determine each talker’s 
speech intelligibility for the four experimental conditions. The results re-
vealed a significant effect for sentence duration and talker, as well as a 
significant sentence-duration × talker interaction. The effects of sentence
duration on visual-speech intelligibility will be discussed. [Work supported 
by NSERC.] 
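
As an illustrative aside that is not part of the original abstract, the size of the stimulus set follows from the factorial design stated above (25 sentences, 2 iterations, 4 duration conditions, 7 talkers). The Python sketch below simply enumerates those design cells; the constant and function names are hypothetical, and nothing here reflects how the authors actually built or randomized their tapes.

```python
from itertools import product

# Factor sizes as stated in the abstract; the names are illustrative only.
N_SENTENCES = 25
N_ITERATIONS = 2
N_DURATION_CONDITIONS = 4
N_TALKERS = 7


def build_trials():
    """Enumerate every (sentence, iteration, condition, talker) cell."""
    return list(
        product(
            range(N_SENTENCES),
            range(N_ITERATIONS),
            range(N_DURATION_CONDITIONS),
            range(N_TALKERS),
        )
    )


if __name__ == "__main__":
    trials = build_trials()
    print(len(trials))  # 25 * 2 * 4 * 7 = 1400 silent test items
```

Under these assumptions each of the 18 subjects would view 1400 items, or 350 per duration condition.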



1aSC10. Cross-modal context effects on the perception of /r/ and /l/
in a speech and nonspeech mode. Kerry P. Green and Linda W. 
Norrix (Ctr. for Neurogenic and Commun. Disord., Univ. of Arizona, 
Tucson, AZ 85721) 

Norrix and Green [J. Acoust. Soc. Am. 99, 2591-2592 (1996)] pro- 
vided evidence for cross-modal context effects on the perception of /r/ and 
/l/ in a stop cluster. Tokens from a synthetic /iri-ili/ continuum were 
dubbed onto a visual /ibi/. When presented in an auditory-visual (AV) 
condition, the tokens were perceived as ranging from /ibri/ to /ibli/. Results 
indicated a reliable shift in the AV condition relative to an auditory-only 
(AO) condition. This shift was in accord with acoustic consequences of 
articulating /r/ and /l/ in a stop cluster. In the current study, sine-wave 
analogs of the /iri-ili/ tokens were constructed and presented to two 
groups of observers in an AO and AV condition. Group One was told they 
would hear schematic speech sounds and instructed to identify what they 
heard as /r/ or /l/. Group Two made up their own criteria for classifying the 
tokens as nonspeech sounds. Results indicated a reliable shift in the /r-l/
boundary between the AO and AV conditions for the speech group only 
and suggest that the influence of the visual articulatory context depends 
upon listeners interpreting the auditory tokens as speech. [Work supported 
by NIDCD, NIH.] 



1aSC11. Effect of sign complexity on speech timing in simultaneous
communication. Robert L. Whitehead (Appl. Lang. and Cognition Res.,
Natl. Tech. Inst. for the Deaf, 52 Lomb Memorial Dr., Rochester, NY
14623-5604), Nicholas Schiavetti (State Univ. of New York, Geneseo,
NY 14454), Brenda Whitehead (Natl. Tech. Inst. for the Deaf, Rochester,
NY 14623), and Dale Evan Metz (State Univ. of New York, Geneseo, NY 
14454) 

Simultaneous communication combines spoken English with manual 
representations of English words by signs and fingerspelling. The purpose 
of this investigation was to study the effect of sign complexity on temporal 
features of speech during simultaneous communication (SC). The effects 
of three independent variables: (a) mode (speech only versus SC); (b) sign 
complexity (base versus elaborated signs); and (c) type of sign movement 
(kinetic versus morphokinetic) were studied on five dependent variables: 
(a) word duration, (b) sentence duration, (c) diphthong duration, (d) 
interword-interval before signed experimental word (IWIB), and (e) 
interword-interval after signed experimental word (IWIA). Audio record- 
ings were made of 12 normal-hearing, experienced sign language users 
speaking experimental words that varied in sign complexity and movement 
under SC and speech only (SO) conditions. Results indicated longer sen- 
tence durations for SC than SO and longer anticipatory durations of IWIB 
and diphthong before signed words, especially those using more complex 
signs. IWIA only lengthened for SC vs SO with no further effect of sign 
complexity. These results indicate a finite effect of sign complexity on 
pause and segment durations before the sign but not as strong an effect as 
has been reported for increased fingerspelling complexity. 



1aSC12. Synthesizing audiovisual speech from physiological signals.

Eric Vatikiotis-Bateson and Hani Yehia (ATR Human Information 
Processing Res. Labs., 2-2 Hikaridai, Seika-cho, Kyoto, 619-02 Japan) 

Previous examination of perceiver eye movement behavior during au- 
diovisual speech tasks has shown that linguistically relevant visual infor- 
mation is distributed over large regions of the face [Vatikiotis-Bateson 
et al., ICSLP-94 (1994)]. Furthermore, the simultaneous production of
facial and vocal tract deformations suggests a single source of control for 
acoustic and visual components of speech production. To further examine
this possibility, orofacial motion during speech has been correlated with 
perioral muscle activity, the time-synchronous behavior of vocal tract ar- 
ticulators, and elements of the speech acoustics (e.g., rms amplitude) using 
both linear and nonlinear modeling techniques. Not surprisingly, since 
small motions require small forces, linear techniques such as minimum 
mean square error and second order autoregression provide reasonably 
good estimates of the inherently nonlinear mapping between muscle EMG 
and orofacial kinematics. This paper assesses the relative merits of such 
linear models versus nonlinear, neural network estimations of the orofacial 
dynamics. 



1aSC13. Neural processes of audio-visual speech perception.

Satoshi Imaizumi, Koichi Mori, Shigeru Kiritani, Masato Yumoto (RILP, 
Univ. Tokyo, Bunkyo-ku, Tokyo, 113 Japan), and Hideaki Seki (Chiba 
Inst. Tech., Japan) 

Neural processes related to audio-visual speech perception were in- 
vestigated by measuring the mismatch magnetic fields (MMF), which re- 
flect a neural activity detecting deviant stimuli randomly inserted in a 
stream of rapidly repeating frequent stimuli. Three audio-visual stimuli 
were used: AbVg (audio signal /ba/ with discrepant visual signal /ga/), 
AbVb, and AdVd. MMF were measured from the left hemisphere of 16 
normal hearing subjects using a 37 ch SQUID magnetometer. The rate of 
non-/ba/ responses to AbVg (McGurk fusion effect) varied depending on 
the stimulus condition and subjects. It was 80% when AbVg was the 
frequent, but was 45% when AbVg was the deviant. Significant MMF-like 
fields were excited in the auditory cortex by deviant AbVg embedded in 
frequent AbVb. The subjects with a low-fusion rate had larger MMF-like 
fields than those with a high rate. These results suggest that the auditory 
mismatch detection process is affected by visual signal, and phonetic cat- 
egorization is affected by a module which can either fuse or dissociate 
audio-visual information. 



1aSC14. Lexical access in French: Recognizing “identical” phrases.

Robert Bannert, Pascale Nicolas, and Monika Stridfeldt (Umeå Univ.,
Dept. of Phonet., S-901 87 Umeå, Sweden)

Information on word recognition and lexical access has been domi-
nated by research on English. French, with its different prosodic coding
and its phonological processes of liaison, enchaînement, and e-deletion,
represents an opportunity to widen the understanding of speech
recognition. A listening test was carried out containing 20 utterances and
ten distractors ranging from one to six syllables and forming seven pairs 
and two triplets of supposedly “identical” linguistic phrases. A represen- 
tative sample of each type produced by a male and a female French 
speaker was presented five times in random order to 18 native listeners of
French (including the two speakers) and nine Swedish learners of French 
as a foreign language. No group identified any of the original utterances 
correctly, and only two pairs of stimuli were identified at random. Taking 
into account the frequencies of the stimuli, it appears that most stimuli 
contain some acoustic information that guides recognition. An acoustic 
analysis of these utterances showed prosodic and, what might be consid- 
ered unexpected, according to the literature, segmental differences in spec-
trum amplitude and F0. [Work supported by the Swedish Council for
Research in the Humanities and Social Sciences.] 



1aSC15. “SLIP-ing” in phonologically similar neighborhoods.

Michael S. Vitevitch (Dept. of Psychol., Park Hall, SUNY, Buffalo, NY
14260-4110) 

Phonological speech errors are the class of speech errors in which one 
or more phonemes are added, deleted, substituted, or reversed. Previous 
work [Dell (1990); Stemberger and MacWhinney (1986)] has found that 
low-frequency words are more likely to be involved in phonological 
speech errors than high-frequency ones. If word frequency influences per-
formance in speech production, do other characteristics of words, such as
neighborhood density or neighborhood frequency, influence speech pro- 
duction? The influence of phonologically similar neighborhoods on a type 
of whole word speech error (malapropisms) has previously been demon- 
strated [Vitevitch (1996)]. The present research examined the influence of 
phonologically similar neighborhoods on phonological speech errors. Us- 
ing the SLIPS technique [Motley and Baars (1976)], phonological speech 
errors were elicited from participants with real word pairs that varied in 
their neighborhood characteristics. The results show that low-frequency 
words “slipped” more often than high-frequency ones, replicating past 
results from error corpora and from other studies using this methodology. 
Moreover, words with sparse phonologically similar neighborhoods tended 
to “slip” more than words with dense phonologically similar neighbor- 
hoods. The implications for these findings on models of speech production, 
speech perception, and lexical representation are discussed. 



1aSC16. Effects of phonotactic probability on segmentation of words
in continuous speech. Daniel E. Gaygen and Paul A. Luce (Lang.
Percept. Lab., Dept. of Psych. and Ctr. for Cognit. Sci., State Univ. of New
York, Park Hall, Buffalo, NY 14260-4110) 

The role of phonotactic probability in the segmentation of spoken 
words in continuous speech was investigated. Participants made speeded 
word detection responses to sequences of spoken stimuli composed of 
target words preceded and followed by nonwords (i.e., NONWORD-
TARGET WORD-NONWORD). Speed and accuracy of detection were 
examined as a function of the nonword contexts of the target words. In 
particular, probabilities of segmental transitions from the nonword con- 
texts to the target words were manipulated: Pairs of segments composed of 
the last segment of the preceding nonword and the first segment of the 
target word — as well as pairs composed of the last segment of the target 
word and the first segment of the following nonword — were varied in 
terms of intraword transition probability (HIGH and NONE) and transition 
type (CC, CV, VC, and VV). Both intraword co-occurrence probabilities 
and transitional probabilities were manipulated. The implications of the 
results for the use of phonotactic probabilities in the identification of words 
in fluent speech will be discussed. 



1aSC17. Frequency effects in malapropisms. William Raymond and
Alan Bell (Dept. of Linguist., Univ. of Colorado, Boulder, CO 80309)

Malapropisms are lexical substitution errors resulting from a failure at 
the stage of accessing the phonological form corresponding to a lemma — a 
semantically and structurally specified lexical entry [Fay and Cutler 
(1977); Garrett (1980)]. An example is tentative for tenable. Since they 
occur naturally in utterance contexts, malapropisms are an important 
source of information about this stage of lexical access. A study of over 
300 malapropisms confirmed prior findings that errors resemble targets 
closely in phonological form and in syntactic category. In addition, it was 
found that errors resemble targets in derivational morphology (indepen- 
dently of phonological similarity), and that there is a strong correlation 
(r² = 0.34) of the text frequencies of the error and target words. The partial
correlations of frequency with word length, syntactic category, and seg- 
mental similarity only account for one-half of this correlation (residual 
r² about 0.17). Models which treat frequency effects as biases on the
decision processes of selecting word forms [Luce et al. (1990)] or as form 
activation thresholds [Jescheniak et al. (1994)] cannot account for such a 
correlation directly. The results thus appear to support some degree of 
organization of the lexicon according to frequency, as suggested, e.g., by 
Forster (1976). 



1aSC18. The role of time during lexical access. Arthur G. Samuel
(Dept. of Psych., SUNY, Stony Brook, NY 11794-2500)

Speech naturally occurs over time — words unfold from “left-to-right.” 
This structure inherently confounds two logically distinct factors: the 
amount of phonetic information presented, and the amount of time the
perceptual process has been working. For example, in the word “acousti- 
cal,” both the elapsed processing time and the number of processed pho-
nemes (or syllables) would be greater for the second /k/ (“c”) than for the
first one. The current study attempts to unconfound these two factors by 
using speech compression and expansion techniques, coupled with phone- 
mic restoration methodology. The strength of phonemic restoration can be 
used as an index of the strength of lexical activation. For example, resto- 
ration is stronger for word-final phonemes than for phonemes in earlier 
word positions, due to the higher lexical activation later in words. The 
current study tests phonemic restoration when available processing time is 
manipulated via speech compression/expansion. An interesting additional 
variable that appears to interact with time is the active lexical cohort size: 
Additional processing time seems to be more useful when the cohort has 
been narrowed to a small set. The results support and constrain activation 
models of lexical access. [Work supported by NIMH.] 



1aSC19. Sources of variability as linguistically relevant aspects of
speech. Lynne C. Nygaard and S. Alexandra Burt (Dept. of Psych.,
Emory Univ., Atlanta, GA 30322) 

Recent research has suggested that only linguistically relevant sources 
of variability, or those which affect the perception of spoken words, are 
retained in long-term memory. The present study sought to determine if the 
linguistic relevance of surface characteristics such as speaking rate, overall 
amplitude, and vocal effort would differentially affect memory retention. 
For each source of variability, a continuous recognition memory task was 
used in which listeners were presented with a list of spoken words and 
asked to judge whether each word in the list was “old” (had occurred 
previously on the list) or “new.” Results showed that listeners were better 
able to identify a word as “old” if the word was repeated at the same 
speaking rate (condition 1), overall amplitude (condition 2), or vocal effort 
(condition 3), suggesting that the individual surface characteristics which 
comprise each type of variability were encoded into memory. However, 
speaking rate and vocal effort produced a greater effect on recognition 
memory than overall amplitude, indicating that all sources of variability 
may not be processed and encoded to the same extent. Rather, memory for 
surface characteristics may be related to the amount of linguistically rel- 
evant information each source of variability contains. 



1aSC20. Spoken word recognition in older adults: Activation and
decision. Jan Charles-Luce (Dept, of CDS and Ctr. for Cognit. Sci., 
Univ. at Buffalo, Buffalo, NY 14260) and Paul A. Luce (Univ. at Buffalo, 
Buffalo, NY 14260) 

Spoken word recognition may be characterized by two successive 
stages: (1) activation of multiple form-based representations in memory 
and (2) frequency-biased perceptual decision. Recent research has sug- 
gested that older adults show deficits in controlled processing in perceptual 
and memory tasks, suggesting that while activation mechanisms subserv- 
ing spoken word recognition may remain relatively intact over time, per- 
ceptual decision processes degrade. In order to investigate the possible loci 
of older adults’ recognition difficulties, perception of specially selected 
words that orthogonally varied on three dimensions was examined: word 
frequency, neighborhood density, and neighborhood frequency. Effects of 
neighborhood density are typically associated with activation mechanisms, 
whereas neighborhood frequency is associated with perceptual decision. 
The implications of these results for theories of aging and spoken word 
recognition will be discussed. [Work supported by grants from NIDCD.] 



1aSC21. Automatic generation of word models using piecewise linear
segment lattices. Hiroaki Kojima and Kazuyo Tanaka (Electrotechnical 
Lab., 1-1-4 Umizono, Tsukuba, Ibaraki, 305 Japan) 

A framework for “phonological concept formation” has been pro- 
posed, aiming to generate robust speech recognition models [Kojima et al., 
Proc. ICSLP 92, Vol. 1, pp. 269-272 (1992)]. For this purpose, a “piece-
wise linear segment lattice” model is proposed. The structure is repre- 
sented as a lattice of segments, each of which is represented as regression 
coefficients of feature vectors within the segment. Compared with typical 
stochastic models such as HMMs, the advantages are: (1) it needs fewer 
samples to learn; (2) it represents objects with arbitrary precision; and (3) its 
structure can be changed dynamically with less computation. An outline of the 
generation algorithm is as follows: (1) Dividing each sample into segments 
using DP, where the number of segments is decided based on an MDL-like 
criterion; (2) matching between the sequences of segments within the same 
word by DP; (3) modifying the division according to their matching scores; 
(4) picking up similar (i.e., near) subsequences and gathering them into a 
phonelike cluster. Speaker-independent isolated word recognition is carried 
out using the proposed models, which are generated under several conditions. 
The results show that the recognition rate is improved by forming 
phonelike clusters. 



1aSC22. Young children’s and older adults’ sensitivity to semantic 
and pragmatic context in speech production. Jan Charles-Luce and 
Kelly M. Dressier (Dept. of Commun. Disord. and Sci. and Ctr. for 
Cognit. Sci., Univ. at Buffalo, Buffalo, NY 14260) 

It is generally expected that speakers of American English neutralize 
the /t/-/d/ voice contrast in words like “writer” and “rider” and instead 
produce them as homonyms. However, it has been demonstrated that 
young adults adjust their articulation depending on context and do not 
always neutralize this contrast. In particular, young adults preserve the 
voice contrast in semantically biasing contexts and when speaking for a 
listener. In the present investigation, the interest was in determining when 
young children became sensitive to semantic and pragmatic context and, 
consequently, when they adjusted their articulation in ways similar to 
young adults. Moreover, there was interest in older adults’ sensitivity to 
context. Young children (ages 7-12), young adults (college age), and older 
adults (ages 60-80) produced minimal pairs containing voiced and voice- 
less intervocalic alveolar stops in two semantic contexts (biasing and neu- 
tral) and in two pragmatic contexts (listener-present and -absent). The 
results showed developmental changes in speakers’ sensitivity to semantic 
and pragmatic contexts. These results will be discussed in terms of inter- 
active activation, involving the structure and organization of a speaker’s 
linguistic system, and pragmatic compensation, involving a speaker’s cog- 
nitive decision processes to adjust articulation for the listener’s benefit. 
[Work supported by NIH.] 



1aSC23. The role of lexical access in spontaneous speech 
disfluencies. Gerald W. McRoberts and Herbert H. Clark (Dept. of 
Psych., Stanford Univ., Stanford, CA 94305) 

Pauses and hesitations in spontaneous speech are assumed to result 
from problems in various aspects of sentence planning. A causal role for 
difficulties with lexical access is suggested by theoretical accounts of 
speech production [Levelt, Speaking (1989)] and empirical studies show- 
ing that pauses are more likely before rare than common words in spon- 
taneous speech [Maclay and Osgood (1959)]. In the present study, word 
frequency was manipulated in a picture-naming task in which speakers 
produced the names of ten high- and ten low-frequency pairs of standard- 
ized line drawings within a standard sentence frame (e.g., There is a snail 
to the left of the harp.). The mean frequency of occurrence for high- and 
low-frequency items was 171.5 (range: 50-591) and 3.5 (range: 1-8) per 
million, respectively [H. Kucera and W. N. Francis, A Computational 
Analysis of Present-day English (1967)]. Analyses indicate that low- 
frequency pairs resulted in: (1) more pauses and word substitutions and (2) 
longer latencies to begin speaking. 



1aSC24. An exploration of listener strategies in the lexical 
segmentation of hypokinetic dysarthric speech. Julie M. Liss, 
Stephanie von Berger (Dept. of Speech and Hear. Sci., Arizona State 
Univ., Box 871908, Tempe, AZ 85281), John Caviness, Charles Adler, 
and Brian Edwards (Mayo Clinic, Scottsdale, AZ 85259) 

This investigation examined listener transcriptions of phrases produced 
by speakers with mild to severe hypokinetic dysarthria to examine indi- 
vidual strategies for the identification of word boundaries in connected 
speech. It was hypothesized that the most efficient listeners would use 
syllabic strength information to guide their parsing of the continuous 
acoustic stream. Six hundred transcribed phrases (10 listeners × 60 
phrases) were coded independently by two judges to identify (1) correct 
word transcriptions, (2) evidence of accurate lexical parsing, regardless of 
exact word identification, and (3) the proportions of accurate parsing of 
strong and weak syllable word onsets. Linear regression analysis of the 
group data revealed that correct segmentation of strong syllable word onsets 
predicted listener performance on segmenting weak syllable word 
onsets [R = 0.805, F(1,29) = 51.440, p < 0.001]. Despite a wide range of 
listener performance on “words correct,” individual strategies for perceptual 
segmentation were evident only in the transcriptions for the most 
severe speaker. In this case, the poorest listeners exhibited disproportionate 
difficulty with the segmentation of weak syllable word onsets. Results suggest 
that a listener’s ability to use syllabic strength information in lexical 
parsing determines, in part, their ability to recognize word onsets. [Work 
supported by NIDCD, NIH.] 



1aSC25. Preliminary report on syllable level organization observed 
in Parkinsonian speech obtained in individuals before and after 
posteroventral pallidotomy. Q. Emily Wang (Dept. of Commun. 
Disord. and Sci., Rush Univ., 1653 W. Congress Pkwy., Chicago, IL 
60612) and Kathleen Shannon (Rush-Presbyterian-St. Luke’s Medical 
Ctr., Chicago, IL 60612) 

Syllable level organization has been evidenced in the articulatory 
movement patterns in different languages [C. P. Browman and L. Goldstein, 
Producing Speech: Contemporary Issues, 19-34 (1995); R. A. Krakow, 
“The articulatory organization of syllables: A kinematic analysis of 
labial and velar gestures,” Ph.D. dissertation, Yale University (1989); Q. E. 
Wang, “Are syllables units of speech motor organization? — A kinematic 
analysis of labial and velar gestures in Cantonese,” Ph.D. dissertation, 
University of Connecticut (1995)]. This study analyzed speech samples 
produced by nondemented individuals with idiopathic Parkinson’s disease 
(Hoehn and Yahr stage 2-3) who underwent posteroventral pallidotomy. 
The data were collected with the patients on and off their medications as 
well as pre- and post-operatively. The preliminary results indicated that the 
patients were able to produce stimuli with syllable-initial nasals with less 
difficulty than those containing syllable-final nasals, and as the patients’ 
motor performance improved, their ability to produce the stimuli with 
syllable-final nasals also improved. This may suggest that the speech motor 
programming and execution are different for the phonemically identical 
phonemes in syllable-initial and syllable-final positions. [Work supported 
by Rush University and NIH Grant DC-00121 to the Haskins Laborato- 
ries.] 



1aSC26. Effects of discourse structure on F0 in Japanese: Raising 
versus lowering. Jennifer J. Venditti (Dept. of Linguist., Ohio State 
Univ., 222 Oxley Hall, 1712 Neil Ave., Columbus, OH 43210) 

Previous studies on English and other languages have shown that dis- 
course structure has an influence on the intonation of a string of sentences 
or phrases. With respect to fundamental frequency in particular, both 
discourse-initial raising of F0 and final lowering effects have been re- 
ported. The present study examines whether discourse structure also influ- 
ences intonation in Japanese, and if so, to what extent initial raising and 
final lowering interact to cue structure. Hierarchically organized discourses 
were constructed in which the target sentence position was varied. These 
short discourses were recorded by native speakers of Tokyo Japanese, and 
the fundamental frequency contours of the target sentences in each position 
were compared. Results indicate that there is a robust raising effect on 
discourse-initial phrases, as reported for other languages. The degree of 
raising is determined by an interaction between distance from the start of 
the utterance and pitch range. In contrast, there is little effect of lowering 
on discourse-final phrases. The few phrases that did show an effect suggest 
that syntactic structure may interact with the lowering process. 



1aSC27. Identification, vocal rating, and acoustical measurement of 
acted emotion. William J. Strong (Dept. of Phys. and Astron., Brigham 
Young Univ., Provo, UT 84602), Bruce L. Brown, Matthew P. 
Spackman, Rong Wang (Brigham Young Univ., Provo, UT 84602), and 
Stanley Feldstein (Univ. of Maryland, Baltimore, MD 21228-5398) 

A series of studies of eight acted emotions compared acoustical mea- 
surements, respondents’ identifications, and vocal ratings. Predictions were 
made as to which of these three approaches would be most accurate in 
differentiating between emotions within a pair. Emotion pairs were clas- 
sified as logically similar and/or perceptibly similar. Logically similar 
emotion pairs (such as sadness/depression or anger/hate) were better dis- 
tinguished by vocal/acoustic analysis. Perceptibly similar but logically dis- 
similar emotions (such as anger, fear, and joy) were better distinguished by 
respondents’ identifications. Multivariate statistics were used to compare 
the respondents’ identification space, the vocal rating space, and the 
acoustical measurement space. 



1aSC28. Are selective adaptation effects independent of cognitive 
load? Donna Kat and Arthur G. Samuel (Dept. of Psych., SUNY, Stony 
Brook, NY 11794-2500) 

Selective adaptation occurs through the repeated presentation of a 
sound (the “adaptor”), and leads to a reduction in the perception of similar 
sounds. Adaptation has been used to investigate the nature of early speech 
representations. Work in this laboratory has recently demonstrated that 
perceptually restored phonemes can produce reliable adaptation effects, 
and that these effects are occurring at relatively early levels (e.g., phone- 
mic rather than lexical) of processing [A. G. Samuel, Cognitive Psychol- 
ogy (in press)]. The current study is designed to determine whether the 
adaptation effects are so low-level and automatic that they do not require 
cognitive resources. There are three conditions in the current study: (1) 
adaptation alone (control), (2) adaptation during continuous arithmetic 
problems, and (3) adaptation during continuous rhyming judgments (pre- 
sented visually). Preliminary results indicate that continuously solving 
arithmetic problems does not reduce the adaptation effect, indicating no 
general cognitive involvement in adaptation. The rhyming task tests for 
any more specific involvement of language processors. Immunity to this 
secondary task would support a very low-level, automatic basis for adap- 
tation. [Work supported by NIMH.] 



1aSC29. Linguistic strategies in the first 6 months of life. Francisco 
Lacerda and Ulla Sundberg (Inst. of Linguist., Stockholm Univ., S-106 91 
Stockholm, Sweden) 

The ability to detect word contrasts embedded in natural sentences was 
studied with 62 Swedish infants whose ages ranged from 48 to 147 days, 
using the high-amplitude sucking (HAS) technique. The infants listened to 
one of four possible pairs of natural carrier sentences, produced as child-directed 
speech, in which a target word was inserted: (1) a pair of sentences 
in which the contrasting target words were inserted in focal position; 
(2) a pair of sentences in which the target words were out of the sentence 
focus; (3) a pair of sentences that differed only in the position of the 
sentence focus; and (4) a control pair with two identical sentences. When 
this group of subjects is divided into two age groups, one below and the 
other above the median age for all the subjects, a pattern of interaction 
between the age groups and the experimental conditions is observed sug- 
gesting that word discrimination capacity in the younger infants may be 
disrupted by dominant F0 variations. For the older group, a tendency to 
attend to the word contrasts delivered in focal position seems to start to 
emerge. [Research supported by The Bank of Sweden Tercentenary Foun- 
dation, Grant No. 94-0435.] 



1aSC30. Contrastive emphasis in elicited dialogue. Donna Erickson 
(Ctr. for Cognit. Sci., 208 Stadium East, 1961 Tuttle Park Pl., Ohio State 
Univ., Columbus, OH 43210) and Ilse Lehiste (Ohio State Univ., 
Columbus, OH 43210) 

Phrase level characteristics of F0 and their interactions with duration 
patterns for a set of contrastively emphasized digits in elicited dialogues 
are examined. Phrases consisting of three digits plus a street name were 
elicited in a dialogue format designed to have the speaker repeat the same 
correction on one of the digits up to five or six times. Perception tests 
determined which phrases were “best perceived” and “worst perceived” 
as to emphasis. Onset, offset, and peak F0 within the sonorant portion of 
each digit of those phrases “best perceived,” “worst perceived,” and the 
reference phrases, with no corrective emphasis, were measured. Results 
suggest that speakers produce well-perceived contrastive emphasis by 
lengthening the duration of the emphasized word, shortening the duration 
of the other words within the phrase, and producing an extensive pitch 
drop between the emphasized word and the following words in the phrase. 
In an elicited dialogue situation in which the speaker is forced to produce 
repetitively the same item with contrastive emphasis, different combina- 
tions of these two cues are used, presumably in an attempt to maximize the 
chance that emphasis will be perceived. 



1aSC31. Social influence on the phonemic transformation effect. 
Verlin B. Hinsz (Psych. Dept., North Dakota State Univ., Fargo, ND 
58105), Magdalene H. Chalikia (Moorhead State Univ., Moorhead, MN 
56563), and David Matz (North Dakota State Univ., Fargo, ND 58105) 

Earlier studies found that repeated sequences of brief steady-state vow- 
els are heard as verbal forms, a phenomenon referred to as the phonemic 
transformation effect (PTE). It has also been established [Chalikia et al., J. 
Acoust. Soc. Am. 91, 2422(A) (1992)] that, when two listeners’ responses 
differ, they can identify the particular stimulus corresponding to each oth- 
er’s verbal forms. Other research suggests that unclear stimuli can be 
influenced by social forces. The present study examined social influences 
on the PTE. Participants were first asked to describe their verbal forms for 
ten vowel sequences used in previous studies. Then they were presented 
with verbal forms reported by previous listeners and were asked to match 
them to the ten stimuli. Most listeners performed the matching task, indi- 
cating that they could perceptually reorganize each stimulus. Finally, they 
were asked to listen to the sequences again and describe their verbal forms. 
About 47% of these responses corresponded to those provided by previous 
listeners, indicating that social information influenced some of the second 
responses to these stimuli. Implications concerning the perceptual inter- 
pretation of speech and social impact theory will be discussed. 







MONDAY MORNING, 2 DECEMBER 1996 



KAHUKU ROOM, 10:15 A.M. TO 12:15 P.M. 



Session 1aSP 



Signal Processing in Acoustics and Speech Communication: 

Nonlinear and Higher Order Techniques 

Gary R. Wilson, Cochair 

Applied Research Laboratories, University of Texas, P.O. Box 8029, Austin, Texas 78713-8029 

Yoshikazu Miyanaga, Cochair 

Laboratory of Intelligent Signal and Language Processing, Division of Electronics and Information Engineering, 
Hokkaido University, Kita 13-jo, Nishi 8, Kita-ku, Sapporo, Hokkaido, 060 Japan 

Contributed Papers 



10:15 

1aSP1. Detection of narrowband harmonics using higher-order 
spectral methods. Martin L. Barlett, G. Douglas Meegan, and Gary R. 
Wilson (Appl. Res. Labs., Univ. of Texas, P.O. Box 8029, Austin, TX 
78713-8029) 

An empirical investigation of detection performance for multiple har- 
monic components imbedded in controlled noise was performed using both 
spectral correlation and stationary bispectrum detection statistics. Conven- 
tional power spectral processing is used as a metric for detection gain 
obtained using the higher-order processing methods. Relative performance 
between power spectral and higher-order spectral methods will be pre- 
sented as a function of both signal-to-noise ratio and harmonic “falloff” at 
fixed probability of false alarm. The empirical results will also be com- 
pared with analytic results obtained assuming asymptotic statistics [G. R. 
Wilson and K. R. Hardwicke, “Nonstationary Higher Order Spectral 
Analysis,” Appl. Res. Labs. Tech. Rep. ARL-TR-91-8, Applied Research 
Laboratories, University of Texas, Austin, TX (1991)]. [Work supported 
under the Independent Research and Development Program, Applied Re- 
search Laboratories, The University of Texas at Austin.] 



10:30 

1aSP2. Fundamental frequency estimation of speech signals using a 
nonlinear observer. Taro Yoshihama, Asako Doi, and Yoshihisa Ishida 
(Dept. of Electron. and Commun., Meiji Univ., 1-1-1, Higashi-Mita, 
Tama-ku, Kawasaki, 214 Japan) 

This paper describes a new method of estimating the fundamental 
frequency of speech signals using an adaptive Fourier analysis (AFA) 
algorithm based on the estimation of signal parameters using a nonlinear 
observer. The recursive estimation of signal parameters such as fundamental 
frequency can be effectively implemented by using the proposed 
AFA algorithm. Nonlinear observers can be used to measure nonlinear 
parameters which are included in observed signals. Recursive estimation 
procedures make it possible to follow the slow changes of signal param- 
eters to be measured. In real-time signal analysis, the recursive algorithm 
is preferred to the conventional Fourier transformation because of this 
property. The AFA algorithm, which is based on the recursive discrete 
Fourier transform (RDFT), is a recursive algorithm for the simultaneous 
estimation of the Fourier coefficients and instantaneous fundamental fre- 
quency of speech signals. This AFA algorithm is applied to speech signals 
with an arbitrary length of the transform window. Further efforts are re- 
quired for better convergence speed of the proposed algorithm, but this 
method is mostly capable of adapting and tracking nonlinear signals such 
as speech sound. 
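
As a rough illustration of the recursive discrete Fourier transform on which the AFA algorithm is said to be based, the following Python sketch updates a single DFT bin sample by sample using the standard sliding-DFT recurrence. It is not the authors' AFA algorithm and contains no nonlinear observer; the function names, the window length, and the test tone are assumptions made only for this example.

```python
import cmath
import math


def sliding_dft_bin(samples, window_len, k):
    """Track DFT bin k over a sliding window of window_len samples.

    Uses the recurrence
        X_k[n] = (X_k[n-1] + x[n] - x[n-N]) * exp(j*2*pi*k/N),
    with samples before the start of the signal taken as zero.
    Returns one bin value per input sample.
    """
    twiddle = cmath.exp(2j * math.pi * k / window_len)
    history = [0.0] * window_len  # circular buffer of the last N samples
    bin_value = 0j
    out = []
    for n, x in enumerate(samples):
        oldest = history[n % window_len]  # sample leaving the window
        history[n % window_len] = x
        bin_value = (bin_value + x - oldest) * twiddle
        out.append(bin_value)
    return out


def direct_dft_bin(window, k):
    """Reference value: DFT bin k of one window, computed directly."""
    n_len = len(window)
    return sum(x * cmath.exp(-2j * math.pi * k * m / n_len)
               for m, x in enumerate(window))
```

Each update costs a fixed number of operations regardless of the window length, which is the property that makes a recursive form attractive for following slowly changing parameters in real time. After any sample, the recursive value should agree with `direct_dft_bin` applied to the most recent `window_len` samples, up to rounding error.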



10:45 

1aSP3. Nonlinear spectrum estimation using a new neural network 
with a level estimator. Hideaki Imai, Yoshikazu Miyanaga, and Koji 
Tochinai (Hokkaido Univ., N13 W8 Kita-ku, Sapporo, 060 Japan) 

This paper proposes a new nonlinear signal processing method using a 
three-layered network which is trained with self-organized clustering and 
supervised learning. The network consists of three layers, i.e., a self- 
organized layer, an evaluation layer, and an output layer. Since the evalu- 
ation layer is designed as a simple perceptron network and the output layer 
is designed as the fixed weight linear nodes, the training complexity is the 
same as the self-organized clustering and a simple perceptron network. In 
other words, quite high speed training can be realized. Generally speaking, 
since the data range usually used in signal processing is arbitrarily large, 
the network output should also cover this range. However, it may be 
difficult for only one node in the network to output these data. Instead of 
this technique, if this dynamic range is covered by using several nodes, the 
complexity of each node is reduced and the associated range is also quite 
limited. This results in a higher performance of this network than the 
conventional ones. As one application, this paper considers 
spectrum envelope estimation of speech waveforms. It is shown that accurate 
spectrum envelopes can be obtained. 



11:00 

1aSP4. Feature extraction for a neural network classifier. Mark 
Wellman and Nassy Srour (U.S. Army Res. Lab., 2800 Powder Mill Rd., 
Adelphi, MD 20783-1197) 

Unattended ground sensor (UGS) networks are intended to detect and 
localize the presence of strategic relocatable targets in the theater of op- 
eration over several kilometers. Passive acoustic sensors, an integral part 
of UGS, have achieved a high level of maturity and will allow acoustic 
target classification for tracked and wheeled vehicles. Of primary impor- 
tance in the classification problem is the selection of a robust feature 
extraction technique, tolerant of both the environment and the nonstation- 
ary nature of the acoustic signatures. Several feature extraction techniques 
were used with experimental acoustic data collected from a small baseline, 
circular array. Results will be presented of the classification for acoustic 
features using a backpropagation neural network with simple power spec- 
trum, harmonic line association [J. A. Robertson, IIT Research Institute, 
in-house report], principal components [J. Mao and A. K. Jain, IEEE Trans. 
Neural Networks 6 (2) (1995)], and wavelet packet [K. Etemad and R. 
Chellappa, Proc. First Intl. Conf. on Image Processing (November 1994)] 
feature extraction techniques. 






11:15 

1aSP5. Blind deconvolution based on common zeros of the z 
transforms of two observed signals. Ken’ichi Furuya and Yutaka 
Kaneda (Speech and Acoust. Lab., NTT Human Interface Labs., 3-9-11 
Midori-cho Musashino-shi, Tokyo, 180 Japan) 

A new method is proposed for recovering an unknown source signal, 
which is observed through two unknown channels characterized by finite 
impulse response filters. Unlike conventional blind deconvolution meth- 
ods, this method can recover not only the spectrum of the signal but also 
the phase characteristics. This method is based on a cost function that 
comes from a filter arrangement with a double-layer structure. The cost 
function is minimized when the common zeros of the z transforms of the 
two observed signals are extracted. If there are no common zeros between 
the system transfer functions of the two unknown channels, the common 
zeros of the observed signals represent the source signal and the noncom- 
mon zeros represent the characteristics of the two channels. Therefore, the 
source signal can be recovered by separating the common zeros from the 
other zeros, that is, by minimizing the cost function. Adaptive filters are 
used for this procedure. Computer simulations using room transfer func- 
tions as the two unknown channels demonstrate the effectiveness of this 
method. 
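
The central idea, that zeros shared by the z transforms of the two observed signals belong to the source when the two channels have no zeros in common, can be illustrated in the noise-free case with a polynomial greatest common divisor. The Python sketch below uses exact rational arithmetic and Euclid's algorithm. It is a stand-in chosen for illustration, not the adaptive-filter cost minimization described in the abstract, and the function names and any example sequences are hypothetical.

```python
from fractions import Fraction


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (highest power first).

    Multiplying coefficient lists is the same as convolving two finite sequences.
    """
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += Fraction(ai) * Fraction(bj)
    return out


def poly_rem(a, b):
    """Remainder of a divided by b; b must have a nonzero leading coefficient."""
    a = [Fraction(c) for c in a]
    while len(a) >= len(b):
        factor = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a.pop(0)  # the leading term has been cancelled exactly
    while a and a[0] == 0:  # strip any remaining leading zeros
        a.pop(0)
    return a


def poly_gcd(a, b):
    """Monic greatest common divisor of two polynomials (Euclid's algorithm)."""
    a = [Fraction(c) for c in a]
    b = [Fraction(c) for c in b]
    while b:
        a, b = b, poly_rem(a, b)
    return [c / a[0] for c in a]
```

If a source sequence is convolved with two channel responses that share no zeros, `poly_gcd` applied to the two results returns the source polynomial up to a scale factor, so both magnitude and phase information are retained. Exact fractions are used because this construction is sensitive to rounding; with measured, noisy signals an exact divisor no longer exists, which is presumably one reason the abstract formulates the problem as a cost function to be minimized.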



11:30 

1aSP6. On the use of the zero-crossing analysis for multi-channel 
signal processing. Takaaki Sugihara, Shoji Kajita, Kazuya Takeda, and 
Fumitada Itakura (Dept. of Info. Elec., Nagoya Univ., Furo-cho 1, 
Chikusa-ku Nagoya-shi, 464-01 Japan) 

Since time precision in finding delays between channels is one of the 
most important issues in estimating direction of arrival (DOA) from multichannel 
signals, a zero-crossing analysis is proposed as a fine and robust 
time-domain preprocessing step for DOA finding. As a preliminary 
evaluation of the noise robustness of zero-crossing information, how the 
zero-crossing points are affected by noise is investigated, i.e., how far the 
zero-crossing points move from their original positions in the presence of noise. 
For test speech, a 48-kHz sampling of a male voice (upsampled from an 
8-kHz signal) is used after adding machine-generated white noise so 
that the SNR of the signal ranges from 0 to 20 dB. The robustness of the 
zero-crossing analysis can be concluded from the summarized results: (1) 
the mean value of the distance is less than 100 μs throughout the SNR range; and 
(2) the standard deviation of the distribution becomes large as the SNR 
decreases; the standard deviation is less than 0.125 even under the condition 
of 0 dB SNR. 
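
To make the measured quantity concrete, the following Python sketch locates zero crossings by linear interpolation between adjacent samples and reports how far each crossing of a noisy signal lies from the nearest crossing of the clean signal. The interpolation scheme, the function names, and the parameters are assumptions for illustration; the abstract does not state how the crossings were located.

```python
def zero_crossing_times(samples, sample_rate):
    """Return zero-crossing times in seconds, refined by linear interpolation.

    A crossing is reported where two adjacent samples differ in sign.
    A sample that is exactly zero is also reported as a crossing.
    """
    times = []
    for i in range(len(samples) - 1):
        a, b = samples[i], samples[i + 1]
        if a == 0.0:
            times.append(i / sample_rate)
        elif a * b < 0.0:
            # fraction of the sample interval at which the line a -> b hits zero
            frac = a / (a - b)
            times.append((i + frac) / sample_rate)
    return times


def crossing_shifts(clean_times, noisy_times):
    """For each noisy crossing, the distance in seconds to the nearest clean crossing."""
    return [min(abs(t - c) for c in clean_times) for t in noisy_times]
```

The mean and standard deviation of the values returned by `crossing_shifts` correspond to the kind of summary statistics the abstract reports, under the assumption that each noisy crossing is paired with its nearest clean crossing.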



11:45 

1aSP7. Source identification using nonlinear signal processing. 
Azizul H. Quazi (Naval Undersea Warfare Ctr. Detachment, New 
London, CT 06320) 

Traditional techniques of signal processing in both the time domain 
and frequency domain are often not sufficient to characterize the 
complex dynamics of underwater sources that generate, radiate, and/or 
reflect sonar signals. The purpose of this paper is to analyze simulated 
signals by means of a nonlinear technique (based on mutual information), 
which is a completely new method that represents a revolutionary advance. 
Mutual information is a measure of the amount of information that one 
random variable contains about another random variable. It is also the 
reduction in uncertainty of one random variable due to knowledge 
of the other. Here the results of simulated sonar signal processing 
based on mutual information are presented and compared with those of the 
traditional correlation technique. The results indicate that mutual 
information is a more powerful tool than the correlation technique, 
and the results based on mutual information provide clues leading to 
source identification. 
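
The definition of mutual information given above can be written as I(X;Y) = sum over (x, y) of p(x,y) log2[p(x,y) / (p(x) p(y))]. The Python sketch below estimates this quantity from paired samples that have already been reduced to discrete symbols. The function name and the use of simple relative frequencies are assumptions for illustration; the abstract does not describe the estimator actually used, and continuous sonar signals would first have to be quantized.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Estimate I(X;Y) in bits from paired discrete samples.

    Probabilities are estimated as relative frequencies of the observed
    symbols and symbol pairs.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    total = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y) written in terms of raw counts
        total += p_xy * math.log2(p_xy * n * n / (px[x] * py[y]))
    return total
```

One way to see the contrast with correlation: for the pairs x = (-1, 0, 1) and y = x squared, the sample covariance is zero, yet this estimate is clearly positive because y is completely determined by x.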



12:00 

1aSP8. Application of NORDEN transform to time-frequency 
characterization of time-varying signals. Nai-chyuan Yen (Physical 
Acoust. Branch, Naval Res. Lab., Washington, DC 20375-5320) and 
Manli C. Wu (NASA Goddard Space Flight Ctr., Greenbelt, MD 20771) 

The non-orthodox decomposition (NORDEN) transform developed by 
NASA [N. Huang et al., “The Empirical Mode Decomposition and the 
Hilbert Spectrum for Nonlinear and Nonstationary Time Series Analysis” 
(to be published)] is designed to adaptively break down a complex time- 
varying signal into a sum of several simple mode functions. Each single 
mode function can then be expressed in terms of an analytic signal whose 
amplitude and phase vary with time and can be displayed as a trace in a 
time frequency plot, the VEIN diagram, with magnitude expressed as line 
thickness or color coded. This methodology of signal analysis, unlike other 
traditional transform methods, i.e., Fourier, Wigner-Ville, and Wavelet, 
requires no special window, preselected kernel, or particularly selected 
mother wavelet and provides more detailed dynamic information about the 
signal under study in terms of its dominant components and their instan- 
taneous frequency variation in the observation time. Several examples 
from various types of acoustic signals are examined with this algorithm to 
demonstrate its signal analysis capability. 







MONDAY MORNING, 2 DECEMBER 1996 



MOLOKAI ROOM, 10:15 TO 11:56 A.M. 



Session 1aUW 



Underwater Acoustics: Transducers, Hydrophones, Arrays, Systems and Related Topics 

Guy V. Norton, Cochair 

Naval Research Laboratory, Code 7181, Acoustics Division, Stennis Space Center, Mississippi 39529-5004 

Hiroshi Ochi, Cochair 

Japan Marine Science and Technology Center, Natushima-cho, 2-15, Yokosuka, Kanagawa, 237 Japan 

Chair’s Introduction — 10:15 

Contributed Papers 



10:20 

1aUW1. A simple constant beamwidth transducer. D. J. Allwright 
(Mathematical Inst., Oxford Univ., Oxford, UK) and J. Power (S. A. I. C. 
Ltd., Cambridge CB3 0RD, UK) 

Many designs for constant beamwidth transducers have appeared over 
the years, but these usually involve complicated array shading techniques 
or “lenses” applied to ultrasonic devices. With locally reacting materials, 
such as PVDF for example, it is possible to manufacture a composite 
material which automatically produces the appropriate frequency- 
dependent shading of the response in a single, continuous transducer. The 
material makes use of the capacitive (dielectric) nature of the active ele- 
ment in conjunction with a specially designed, electrically resistive layer, 
to which a single point connection is made. The resulting CR circuit acts 
as an integrator, and since the resistance to the remote parts of the trans- 
ducer is largest, the response of those parts progressively reduces as fre- 
quency is increased. In other words, the effective size of the transducer 
reduces (i.e., “shrinks”) as frequency is increased. Correct design of the 
composite provides a constant directivity characteristic. Several applica- 
tions of this “shrinker technology” are discussed, including microphones, 
loudspeakers, and hydrophones. 
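
The frequency-dependent shading described above can be pictured with a lumped first-order model, in which a part of the transducer reached through resistance R and having capacitance C responds with magnitude 1 / sqrt(1 + (2 pi f R C)^2). The Python sketch below evaluates that expression. It is a simplification assumed for illustration, since the actual device uses a distributed resistive layer, and the function name and component values are hypothetical.

```python
import math


def rc_lowpass_gain(freq_hz, resistance_ohm, capacitance_f):
    """Magnitude response of a first-order RC low-pass section.

    Well above the corner frequency 1 / (2*pi*R*C) the section behaves
    as an integrator, with gain falling in proportion to 1/f.
    """
    omega_rc = 2.0 * math.pi * freq_hz * resistance_ohm * capacitance_f
    return 1.0 / math.sqrt(1.0 + omega_rc ** 2)
```

In this picture a remote part of the transducer, reached through a larger resistance, has a lower corner frequency than a part near the connection point, so its contribution falls away first as frequency rises and the effective aperture shrinks.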



10:32 

1aUW2. Broad bandwidth underwater transducer. James E. Barger 
(BBN, Inc., 70 Fawcett St., Cambridge, MA 02138) 

A new underwater transducer design technique is described that pro- 
vides for ultrabroad-bandwidth directive arrays that are lightweight and 
thin (for easy towing). An example of this new technique is a planar array 
that is 1 m high, 0.75 m wide, and 0.15 m thick. This array has greater than 
60% radiation efficiency throughout the frequency band extending from 
400 Hz to 4 kHz, and a nominal power output within this band of 10 kW. 
The array is neutrally buoyant, and uses Navy type-III PZT actuators. The 
broadband goal is met by designing the transducers in such a way that their 
own stiffness reactances are mostly canceled by their masslike radiation 
reactances over the entire operating band. This feat is accomplished both 
by properly adjusting the ratio of actuator cross-sectional area to radiation 
area, and by using low dynamic transducer mass. Since the principle of 
operation requires for each transducer that its motion be influenced by its 
own radiation load, it is necessary to control each transducer’s motion so 
that the desired radiation pattern is achieved. This is done by a filtered- 
X feedforward LMS adaptive controller. 



10:44 

1aUW3. Characterize reverberation spatial distribution using an 
L-shape array. Yung P. Lee (Science Applications Intl. Corp., 1710 
Goodridge Dr., MS T1-3-5, McLean, VA 22102) 

Matched-field processing (MFP) is a generalized beamforming. Instead 
of using plane-wave replica vectors in the plane-wave beamforming, in 
MFP, an acoustic model is used to calculate replica vectors in a search 
space. In addition to azimuthal and elevation information, which can be 
obtained by plane-wave beamforming, MFP also provides range and depth 
information. In active systems, scattered signals are usually modeled by 
two-way, source-to-scatterer and scatterer-to-receiver, propagation. As- 
suming that scatterers are illuminated and reradiate energy as point 
sources, the scattering field can be characterized by using just one-way, 
scatterer-to-receiver propagation. Passive MFP can then be used to map the 
spatial distribution of the scatterers. A shallow-water experiment took
place off the Gulf of Mexico in November 1995; reverberation was mea- 
sured on an L-shaped array from a bottomed cw source. Passive MFP was 
used to characterize the spatial distribution of the measured reverberation. 
[Work supported by U.S. Navy.] 
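
The passive MFP idea of treating a scatterer as a point source can be sketched with a Bartlett processor. This is only an illustration under strong assumptions: a free-space spherical-spreading model stands in for the shallow-water acoustic model, and the L-shaped geometry, the 100-Hz frequency, the 1500-m/s sound speed, and the search grid are all hypothetical.

```python
import numpy as np

def replica(sensors, src, freq, c=1500.0):
    """Unit-norm replica vector from a point at src to each sensor (free space)."""
    r = np.linalg.norm(sensors - src, axis=1)
    g = np.exp(-2j * np.pi * freq * r / c) / r
    return g / np.linalg.norm(g)

def bartlett_surface(K, sensors, ranges, depths, freq):
    """Bartlett power w^H K w on a range-depth grid in the x-z plane."""
    B = np.empty((len(depths), len(ranges)))
    for i, z in enumerate(depths):
        for j, x in enumerate(ranges):
            w = replica(sensors, np.array([x, 0.0, z]), freq)
            B[i, j] = np.real(np.vdot(w, K @ w))
    return B

# Hypothetical L-shaped array: a vertical leg plus a horizontal leg on the bottom.
vertical = [(0.0, 0.0, z) for z in np.arange(10.0, 81.0, 5.0)]
horizontal = [(x, 0.0, 80.0) for x in np.arange(5.0, 61.0, 5.0)]
sensors = np.array(vertical + horizontal)

freq = 100.0
true_scatterer = np.array([500.0, 0.0, 60.0])
p = replica(sensors, true_scatterer, freq)   # noise-free field from one scatterer
K = np.outer(p, p.conj())                    # cross-spectral density matrix

ranges = np.arange(100.0, 1001.0, 50.0)
depths = np.arange(10.0, 91.0, 10.0)
B = bartlett_surface(K, sensors, ranges, depths, freq)
i, j = np.unravel_index(np.argmax(B), B.shape)
estimate = (ranges[j], depths[i])
```

With measured data, `K` would be estimated from array snapshots, and the replica function would be replaced by a propagation model suited to the environment.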



10:56 

1aUW4. Vertical array performance in shallow water with a
directional noise field. Kwang Yoo and T. C. Yang (Naval Res. Lab., 
Washington, DC 20375) 

The performance of a vertical array for source localization and for 
target detection (array gain) is studied in shallow water under a directional 
noise field. The acoustic environment consists of a downward refractive 
sound-speed profile. At mid (e.g., 500 Hz) frequencies the vertical direc- 
tionality of the surface generated noise exhibits a notch near the signal 
arrival direction, which is close to horizontal for a submerged source.
The ability of the beam filter in rejecting directional noise has been pre- 
viously demonstrated using conventional beamforming. The same capabil- 
ity is carried over into full field processing by using matched-beam pro- 
cessing which is an equivalent of matched-field processing conducted in 
the beam domain. It is shown with simulated data that matched-beam 
processing incorporating a beam filter of 10 deg enhances the array output 
signal-to-noise ratio (array gain) by more than 5 dB compared with con- 
ventional beamforming and matched-field processing. For white noise, 
matched-field processing yields a higher peak-to-sidelobe ratio than 
matched-beam processing with a beam filter. For directional noise, 
matched-beam processing with a beam filter yields better results than 
matched-beam processing using a minimum-variance correlator. The noise
field in the background ambiguity surface (the sidelobes) has been sup- 
pressed by the beam filter. 
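
The effect of a noise notch near the horizontal on array gain can be illustrated with a small sketch. It covers only conventional plane-wave beamforming, not matched-beam processing, and every number in it is an assumption: a 16-element, half-wavelength-spaced vertical array, noise arriving only at 30 to 90 deg from horizontal, and a small white-noise floor.

```python
import numpy as np

def steering(n, sin_theta):
    """Plane-wave steering vector for a half-wavelength-spaced vertical array.
    sin_theta is the sine of the arrival angle measured from horizontal."""
    return np.exp(1j * np.pi * np.arange(n) * sin_theta)

def array_gain_db(Kn, n, sin_theta_sig=0.0):
    """Array gain of a conventional beamformer steered at the signal.
    Signal gain is unity (w^H a = 1), so the gain is input noise over output noise."""
    w = steering(n, sin_theta_sig) / n
    noise_in = np.real(np.trace(Kn)) / n          # mean noise power per sensor
    noise_out = np.real(np.vdot(w, Kn @ w))       # noise power at beam output
    return 10 * np.log10(noise_in / noise_out)

n_elem = 16
K_white = np.eye(n_elem)

# Directional noise: energy only at steep angles, leaving a notch near horizontal.
angles = np.deg2rad(np.arange(30.0, 91.0, 1.0))
K_dir = sum(np.outer(steering(n_elem, np.sin(a)), steering(n_elem, np.sin(a)).conj())
            for a in angles) / len(angles)
K_dir = K_dir + 0.01 * np.eye(n_elem)             # assumed white-noise floor

ag_white = array_gain_db(K_white, n_elem)         # 10*log10(N) for white noise
ag_dir = array_gain_db(K_dir, n_elem)
```

In this toy case the gain against the notched noise field exceeds the white-noise value of 10 log N because the steep-angle noise falls in the sidelobes of the horizontally steered beam.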



2577 J. Acoust. Soc. Am., Vol. 100, No. 4, Pt. 2, Oct. 1996 3rd Joint Meeting: Acoustical Societies of America and Japan 2577 



11:08

1aUW5. A new towed array shape estimation scheme for real-time
sonar systems. Feng Lu, Evangelos Milios (Comput. Sci. Dept., York 
Univ., 4700 Keele St., Downsview, ON M3J 1P3, Canada), and Stergios 
Stergiopoulos (Defence Res. Establishment Atlantic, Dartmouth, NS B2Y 
3Z7, Canada) 

In real-time towed array systems, performance degradation of array 
gain occurs when beamforming is carried out on the sensor outputs of a 
line array which is not straight. In this paper, a new method is proposed for 
array shape estimation. The procedure consists of two steps. First, the 
tow-point-induced motion is formulated in the time domain based on the 
constraints from the tow-point compass-sensor readings and from a discretized
Paidoussis equation. At each time instant, the shape estimate is obtained by
solving a linear system of equations. It is shown that this solution is
equivalent to a previous frequency-domain solution, while the new approach
is much simpler. In the second step, the tail compass-sensor data
are used to adjust the overall array shape. By noting that variations in the 
ship speed lead to a distortion in the normalized time axis, the predicted 
tail displacements are first registered with the tail sensor readings along the 
time axis. Then distortions in the estimated array shape over its length can 
be compensated accordingly. A slow-changing bias between sensor zeros is 
also modeled in order to remove systematic sensor errors. The effective- 
ness of the new algorithm is demonstrated with real sea-trial data. 
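
Why variations in ship speed distort the time axis can be seen in a much simpler kinematic stand-in, often called a water-pulley model, in which every point of the array retraces the tow-point path. The sketch below is not the discretized Paidoussis formulation of the abstract; the function name, the tow-point track, and the constant tow speed are assumptions.

```python
import numpy as np

def water_pulley_shape(t_hist, y_tow, t_now, x, speed):
    """Lateral displacement at distances x (m) behind the tow point at time t_now.

    Assumes each array element retraces the tow-point path, so an element at
    distance x sees the tow-point displacement delayed by x / speed.
    """
    delayed_t = t_now - np.asarray(x, dtype=float) / speed
    return np.interp(delayed_t, t_hist, y_tow)

# Hypothetical tow-point track: a 60-s lateral oscillation of 2-m amplitude.
t_hist = np.arange(0.0, 200.0, 1.0)
y_tow = 2.0 * np.sin(2 * np.pi * t_hist / 60.0)

x = np.linspace(0.0, 200.0, 5)     # positions along a 200-m array
shape = water_pulley_shape(t_hist, y_tow, t_now=150.0, x=x, speed=2.5)
```

Because the delay is x / speed, an error in the assumed speed stretches or compresses the mapping between time and position along the array, which is the kind of distortion that registering predicted and measured tail displacements is meant to correct.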

11:20 

1aUW6. Simulated performance of an acoustic modem using
phase-modulated signals in a time-varying, shallow-water
environment. Christian Bjerrum-Niese and Leif Bjørnø (Dept. of
Industrial Acoust., Tech. Univ. of Denmark, DK-2800 Lyngby, Denmark)

Underwater acoustic modems using coherent modulation, such as 
phase-shift keying, have proven to efficiently exploit the bandlimited un- 
derwater acoustical communication channel. However, the performance of 
an acoustic modem, given as maximum range, data rate, and error rate, is
limited in the complex and dynamic multipath channel. Multipath arrivals 
at the receiver cause phase distortion and fading of the signal envelope. 
Yet, for extreme ratios of range to depth, the delays of multipath arrivals 
decrease, and the channel impulse response coherently contributes energy 
to the signal at short delays relative to the first arrival, while longer delays 
give rise to intersymbol interference. Following this, the signal-to- 
multipath ratio (SMR) is introduced. It is claimed that the SMR determines 
the performance rather than the signal-to-noise ratio (SNR). Using a ray 
model including temporal variations of the shallow-water environment, the 
performance of the acoustic modem may be estimated. Simulations indi- 
cate that optimum performance is not necessarily found at receiver depths 
yielding the maximum total signal level, since the SMR may correspond- 
ingly be low due to strong intersymbol interference. [Work sponsored by 
the Danish Technical Research Council.] 
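
One way to compute a signal-to-multipath ratio from a set of ray arrivals is sketched below. The abstract does not give a formula, so the definition used here is an assumption: arrivals within a chosen window after the first arrival count as signal, later arrivals count as intersymbol interference, and energies are summed incoherently.

```python
import numpy as np

def smr_db(delays, amps, window):
    """Signal-to-multipath ratio in dB (assumed definition).

    delays  arrival times of the rays in seconds
    amps    corresponding arrival amplitudes
    window  arrivals within this time of the first arrival count as signal
    """
    delays = np.asarray(delays, dtype=float)
    amps = np.asarray(amps, dtype=float)
    rel = delays - delays.min()
    useful = np.sum(amps[rel <= window] ** 2)    # energy adding to the symbol
    isi = np.sum(amps[rel > window] ** 2)        # energy spilling into later symbols
    if isi == 0:
        return np.inf
    return 10 * np.log10(useful / isi)

# Hypothetical arrivals: direct path, one close multipath, one late multipath.
smr = smr_db(delays=[0.0, 0.2e-3, 3.0e-3], amps=[1.0, 0.5, 0.5], window=1.0e-3)
```

A receiver depth with a high total level can still score poorly on this measure if most of the added energy arrives outside the window.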



11:32

1aUW7. Development of the acoustic packet data relay
communication system and the results of its sea trial. Hiroshi Ochi, 
Takuya Shimura, Yasutaka Amitani, and Toshio Tsuchiya (Deep-Sea 
Technol. Dept., Japan Marine Sci. and Technol. Ctr., Natsushima-cyo 2-15, 
Yokosuka, 237 Japan) 

In order to recover the data obtained by various sensors deployed in the
ocean, an acoustic digital data communication system was studied [Ochi
et al., Proc. Meeting Marine Acoust. Soc. Jpn., pp. 85-86 (1996) (in
Japanese)]. The system can comprise up to 99 units of the same type of
equipment in an area, and its specifications are: half-duplex communication,
FSK modulation, a 2500-bps data rate, HDLC (high-level data link control
procedure) protocol packet transmission, CRC (cyclic redundancy check) code,
and frame retransmission for error detection/correction. The sea trial of
this system was carried out in an area about 4000 m deep. Three sets of
equipment were deployed in the ocean, one of them suspended from the ship,
and relay data transmission among the three sets was carried out. The
details of the sea trial will be shown in this presentation.
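
Error detection by CRC combined with frame retransmission can be sketched as a stop-and-wait loop, which suits a half-duplex link. This is an illustration, not the system's protocol: `binascii.crc_hqx` (a CRC-CCITT variant) stands in for the HDLC frame check sequence, and the frame layout, the `flaky_channel` helper, and the retry limit are hypothetical.

```python
import binascii

def make_frame(payload: bytes) -> bytes:
    """Append a 16-bit CRC to the payload."""
    crc = binascii.crc_hqx(payload, 0xFFFF)
    return payload + crc.to_bytes(2, "big")

def check_frame(frame: bytes):
    """Return the payload if the CRC matches, otherwise None."""
    payload, received_crc = frame[:-2], int.from_bytes(frame[-2:], "big")
    if binascii.crc_hqx(payload, 0xFFFF) == received_crc:
        return payload
    return None

def send_with_retransmission(payload, channel, max_tries=5):
    """Resend the frame until the receiver's CRC check passes or tries run out."""
    frame = make_frame(payload)
    for attempt in range(1, max_tries + 1):
        received = check_frame(channel(frame))
        if received is not None:
            return received, attempt
    return None, max_tries

def flaky_channel(n_bad):
    """Hypothetical channel that flips one bit in the first n_bad transmissions."""
    state = {"count": 0}
    def channel(frame):
        state["count"] += 1
        if state["count"] <= n_bad:
            return bytes([frame[0] ^ 0x01]) + frame[1:]
        return frame
    return channel

result, tries = send_with_retransmission(b"CTD 4.2C", flaky_channel(2))
```

At the stated 2500 bps, the 10-byte frame in this example occupies 80/2500 = 32 ms per transmission, before any propagation delay or protocol overhead is counted.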



11:44 

1aUW8. The effectiveness of a thin wedge design anechoic lining for
long-time signature measurements. Walter H. Boober (NUWC Code
8211, Bldg. 1171, Newport, RI 02841) and Scott Emery (Vector Res. Co., 
Inc., Rockville, MD 20852) 

A thin wedge design anechoic lining has proven to be very effective for 
long time signature measurements at frequencies above 10 kHz. Partial 
treatment on five of six surfaces in a 60×40×35-ft³ depth tank resulted in
accurate transfer functions of a test transducer using late sample delays for 
long-time signature waveforms. Comparison of data resulting from delays 
prior to wall/surface reflections with delays as late as 200 ms revealed 
negligible contribution, if any, from reflections. The “reflection” data 
when graphed over “free-field” data were indistinguishable. Tests were 
run using a directional array projector and a horizontally omnidirectional 
projector. The hydrophone was an H-52, a 5.1-cm vertical line array that is
omnidirectional in the horizontal plane. This allowed some discrimination
from surface and bottom reflections. Reflections from the untreated wall 
were strong contributors while using the omnidirectional projector and 
were revealed in radiation patterns of the directional array. These tests 
under real-world, everyday measurement conditions demonstrate the effectiveness
of the panels for long pulse times with late delay sampling in
constrained boundaries. 
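
The distinction between sample delays before and after the first wall reflection can be made concrete with an image-source calculation. The geometry below is hypothetical, and a sound speed of 1500 m/s is assumed.

```python
import math

def first_wall_reflection_delay(src, rcv, wall_y, c=1500.0):
    """Extra travel time (s) of the single-bounce path off a wall at y = wall_y,
    relative to the direct path, using the image-source construction."""
    direct = math.dist(src, rcv)
    image = (src[0], 2 * wall_y - src[1])    # source mirrored in the wall
    reflected = math.dist(image, rcv)
    return (reflected - direct) / c

# Hypothetical layout in metres: projector and hydrophone 3 m apart, wall 4 m away.
delay = first_wall_reflection_delay(src=(0.0, 0.0), rcv=(3.0, 0.0), wall_y=4.0)

# Path length travelled by sound arriving 200 ms after emission.
late_path_m = 0.200 * 1500.0
```

Under the same assumed sound speed, a 200-ms delay corresponds to 300 m of travel, many times the 60-ft (about 18.3-m) longest tank dimension, so samples taken that late would contain multiple orders of reflection from any untreated surface.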



